400.Can SAS Detect Hidden Fraud in Remote Work Environments?
Can SAS Detect Hidden Fraud in Remote Work Environments?
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HERE IN THIS PROJECT WE USED THESE SAS STATEMENTS —
DATA STEP | PROC SQL | PROC PRINT | PROC SGPLOT | MACROS | PROC CORR | PROC MEANS | PROC FREQ | PROC UNIVARIATE | PROC SORT | APPEND | MERGE | PROC TRANSPOSE | PROC DATASETS DELETE | DATA FUNCTIONS
TABLE OF CONTENTS
- Introduction
- Business Context
- Project Objectives
- Dataset Design &
Variables
- Raw Dataset Creation
- Data Cleaning &
Formatting
- Date Intelligence
(MDY, INTCK, INTNX)
- Character & Numeric
Functions
- Productivity &
Utilization Logic (Macros)
- Fraud Detection
Logic (Macros)
- SQL Analytics
- Statistical Analysis
- Frequency &
Distribution Analysis
- Correlation
Analysis
- Visualization (SGPLOT)
- Transpose, Append,
Set & Merge
- PROC DATASETS
- Business Insights
- Interview-Ready Talking
Points
- Conclusion
1. INTRODUCTION
Remote
work has transformed how organizations manage productivity, employee
well-being, and operational risk. Unlike office environments, remote teams
require data-driven monitoring to ensure:
- Optimal productivity
- Controlled burnout
- Fair workload distribution
- Fraud or misuse detection
This
project simulates a real corporate analytics use case using SAS.
2.BUSINESS CONTEXT
Organizations
with remote employees face challenges like:
- Over-reporting work hours
- Low output despite high
availability
- Burnout leading to attrition
- Communication gaps
- Fraudulent time reporting
Hence,
analytics teams use SAS to monitor patterns using:
- Productivity Index
- Utilization classification
- Burnout thresholds
- Statistical anomaly
detection
3.PROJECT OBJECTIVES
* Create a remote employee dataset (15+ records)
* Apply SQL,
DATA step, Macros
* Perform descriptive
& inferential statistics
*
Implement fraud detection logic
*
Demonstrate interview-level SAS expertise
4. DATASET DESIGN & VARIABLES
|
Variable |
Description |
|
Employee_ID |
Unique
employee number |
|
Employee_Name |
Employee
full name |
|
Role |
Job
role |
|
Join_Date |
Date of
joining |
|
Report_Date |
Date of
performance capture |
|
Hours_Worked |
Daily
working hours |
|
Tasks_Completed |
Tasks
completed |
|
Communication_Frequency |
Meetings/emails
count |
|
Burnout_Score |
Stress
indicator |
|
Productivity_Index |
Derived
KPI (%) |
|
Utilization_Class |
Macro-driven |
|
Fraud_Flag |
Macro-driven |
5. RAW DATASET CREATION
data remote_employees_raw;
length Employee_Name $20. Role $20. Region $15. Work_Mode $15.;
input Employee_ID Employee_Name $ Role $ Join_Date : date9. Report_Date : date9.
Hours_Worked Tasks_Completed Communication_Frequency Burnout_Score Region $
Work_Mode $;
format Join_Date Report_Date date9.;
datalines;
101 Nithish Data_Analyst 15JAN2022 01JAN2026 9 12 6 32 India Fully_Remote
102 Anjali SAS_Programmer 12FEB2021 01JAN2026 10 14 8 28 India Hybrid
103 Rahul Statistician 20MAR2020 01JAN2026 11 15 9 35 USA Fully_Remote
104 Sneha Data_Manager 01APR2019 01JAN2026 8 10 5 40 UK Fully_Remote
105 Arjun BI_Analyst 17MAY2022 01JAN2026 7 9 4 22 India Hybrid
106 Kavya Clinical_Analyst 10JUN2021 01JAN2026 9 13 7 30 India Fully_Remote
107 Amit QA_Analyst 05JUL2020 01JAN2026 12 16 10 45 USA Fully_Remote
108 Pooja Risk_Analyst 18AUG2019 01JAN2026 6 7 3 20 UK Hybrid
109 Rakesh SQL_Developer 01SEP2022 01JAN2026 8 11 6 27 India Fully_Remote
110 Neha Data_Scientist 15OCT2018 01JAN2026 10 14 8 38 USA Fully_Remote
111 Suresh ETL_Developer 01NOV2021 01JAN2026 9 12 7 29 India Hybrid
112 Kiran Report_Analyst 20DEC2020 01JAN2026 7 9 5 26 India Fully_Remote
113 Meena Validation_Lead 05JAN2019 01JAN2026 11 15 9 42 UK Fully_Remote
114 Prasad Macro_Developer 14FEB2022 01JAN2026 10 14 8 33 India Hybrid
115 Divya Safety_Analyst 30MAR2021 01JAN2026 8 10 6 25 India Fully_Remote
;
run;
proc print data=remote_employees_raw;
run;
OUTPUT:
| Obs | Employee_Name | Role | Region | Work_Mode | Employee_ID | Join_Date | Report_Date | Hours_Worked | Tasks_Completed | Communication_Frequency | Burnout_Score |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Nithish | Data_Analyst | India | Fully_Remote | 101 | 15JAN2022 | 01JAN2026 | 9 | 12 | 6 | 32 |
| 2 | Anjali | SAS_Programmer | India | Hybrid | 102 | 12FEB2021 | 01JAN2026 | 10 | 14 | 8 | 28 |
| 3 | Rahul | Statistician | USA | Fully_Remote | 103 | 20MAR2020 | 01JAN2026 | 11 | 15 | 9 | 35 |
| 4 | Sneha | Data_Manager | UK | Fully_Remote | 104 | 01APR2019 | 01JAN2026 | 8 | 10 | 5 | 40 |
| 5 | Arjun | BI_Analyst | India | Hybrid | 105 | 17MAY2022 | 01JAN2026 | 7 | 9 | 4 | 22 |
| 6 | Kavya | Clinical_Analyst | India | Fully_Remote | 106 | 10JUN2021 | 01JAN2026 | 9 | 13 | 7 | 30 |
| 7 | Amit | QA_Analyst | USA | Fully_Remote | 107 | 05JUL2020 | 01JAN2026 | 12 | 16 | 10 | 45 |
| 8 | Pooja | Risk_Analyst | UK | Hybrid | 108 | 18AUG2019 | 01JAN2026 | 6 | 7 | 3 | 20 |
| 9 | Rakesh | SQL_Developer | India | Fully_Remote | 109 | 01SEP2022 | 01JAN2026 | 8 | 11 | 6 | 27 |
| 10 | Neha | Data_Scientist | USA | Fully_Remote | 110 | 15OCT2018 | 01JAN2026 | 10 | 14 | 8 | 38 |
| 11 | Suresh | ETL_Developer | India | Hybrid | 111 | 01NOV2021 | 01JAN2026 | 9 | 12 | 7 | 29 |
| 12 | Kiran | Report_Analyst | India | Fully_Remote | 112 | 20DEC2020 | 01JAN2026 | 7 | 9 | 5 | 26 |
| 13 | Meena | Validation_Lead | UK | Fully_Remote | 113 | 05JAN2019 | 01JAN2026 | 11 | 15 | 9 | 42 |
| 14 | Prasad | Macro_Developer | India | Hybrid | 114 | 14FEB2022 | 01JAN2026 | 10 | 14 | 8 | 33 |
| 15 | Divya | Safety_Analyst | India | Fully_Remote | 115 | 30MAR2021 | 01JAN2026 | 8 | 10 | 6 | 25 |
·
Core SAS dataset creation
·
Supports formats, informats, raw ingestion
·
Industry-standard for structured data
6. DATA CLEANING & FORMATTING
data remote_employees_clean;
set remote_employees_raw;
Employee_Name = propcase(strip(Employee_Name));
Role = upcase(Role);
Region = lowcase(Region);
Work_Mode = catx('_',strip(Work_Mode));
run;
proc print data=remote_employees_clean;
run;
OUTPUT:
| Obs | Employee_Name | Role | Region | Work_Mode | Employee_ID | Join_Date | Report_Date | Hours_Worked | Tasks_Completed | Communication_Frequency | Burnout_Score |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Nithish | DATA_ANALYST | india | Fully_Remote | 101 | 15JAN2022 | 01JAN2026 | 9 | 12 | 6 | 32 |
| 2 | Anjali | SAS_PROGRAMMER | india | Hybrid | 102 | 12FEB2021 | 01JAN2026 | 10 | 14 | 8 | 28 |
| 3 | Rahul | STATISTICIAN | usa | Fully_Remote | 103 | 20MAR2020 | 01JAN2026 | 11 | 15 | 9 | 35 |
| 4 | Sneha | DATA_MANAGER | uk | Fully_Remote | 104 | 01APR2019 | 01JAN2026 | 8 | 10 | 5 | 40 |
| 5 | Arjun | BI_ANALYST | india | Hybrid | 105 | 17MAY2022 | 01JAN2026 | 7 | 9 | 4 | 22 |
| 6 | Kavya | CLINICAL_ANALYST | india | Fully_Remote | 106 | 10JUN2021 | 01JAN2026 | 9 | 13 | 7 | 30 |
| 7 | Amit | QA_ANALYST | usa | Fully_Remote | 107 | 05JUL2020 | 01JAN2026 | 12 | 16 | 10 | 45 |
| 8 | Pooja | RISK_ANALYST | uk | Hybrid | 108 | 18AUG2019 | 01JAN2026 | 6 | 7 | 3 | 20 |
| 9 | Rakesh | SQL_DEVELOPER | india | Fully_Remote | 109 | 01SEP2022 | 01JAN2026 | 8 | 11 | 6 | 27 |
| 10 | Neha | DATA_SCIENTIST | usa | Fully_Remote | 110 | 15OCT2018 | 01JAN2026 | 10 | 14 | 8 | 38 |
| 11 | Suresh | ETL_DEVELOPER | india | Hybrid | 111 | 01NOV2021 | 01JAN2026 | 9 | 12 | 7 | 29 |
| 12 | Kiran | REPORT_ANALYST | india | Fully_Remote | 112 | 20DEC2020 | 01JAN2026 | 7 | 9 | 5 | 26 |
| 13 | Meena | VALIDATION_LEAD | uk | Fully_Remote | 113 | 05JAN2019 | 01JAN2026 | 11 | 15 | 9 | 42 |
| 14 | Prasad | MACRO_DEVELOPER | india | Hybrid | 114 | 14FEB2022 | 01JAN2026 | 10 | 14 | 8 | 33 |
| 15 | Divya | SAFETY_ANALYST | india | Fully_Remote | 115 | 30MAR2021 | 01JAN2026 | 8 | 10 | 6 | 25 |
·
STRIP, TRIM → remove spaces
·
PROPCASS, UPCASE, LOWCASE
→ standardization
·
CATX → controlled concatenation
7. DATE INTELLIGENCE
data remote_dates;
set remote_employees_clean;
Tenure_Months = intck('month', Join_Date, Report_Date);
Next_Review_Date = intnx('month', Report_Date, 3, 'same');
Review_Month = month(mdy(month(Report_Date),1,year(Report_Date)));
run;
proc print data=remote_dates;
run;
OUTPUT:
| Obs | Employee_Name | Role | Region | Work_Mode | Employee_ID | Join_Date | Report_Date | Hours_Worked | Tasks_Completed | Communication_Frequency | Burnout_Score | Tenure_Months | Next_Review_Date | Review_Month |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Nithish | DATA_ANALYST | india | Fully_Remote | 101 | 15JAN2022 | 01JAN2026 | 9 | 12 | 6 | 32 | 48 | 24197 | 1 |
| 2 | Anjali | SAS_PROGRAMMER | india | Hybrid | 102 | 12FEB2021 | 01JAN2026 | 10 | 14 | 8 | 28 | 59 | 24197 | 1 |
| 3 | Rahul | STATISTICIAN | usa | Fully_Remote | 103 | 20MAR2020 | 01JAN2026 | 11 | 15 | 9 | 35 | 70 | 24197 | 1 |
| 4 | Sneha | DATA_MANAGER | uk | Fully_Remote | 104 | 01APR2019 | 01JAN2026 | 8 | 10 | 5 | 40 | 81 | 24197 | 1 |
| 5 | Arjun | BI_ANALYST | india | Hybrid | 105 | 17MAY2022 | 01JAN2026 | 7 | 9 | 4 | 22 | 44 | 24197 | 1 |
| 6 | Kavya | CLINICAL_ANALYST | india | Fully_Remote | 106 | 10JUN2021 | 01JAN2026 | 9 | 13 | 7 | 30 | 55 | 24197 | 1 |
| 7 | Amit | QA_ANALYST | usa | Fully_Remote | 107 | 05JUL2020 | 01JAN2026 | 12 | 16 | 10 | 45 | 66 | 24197 | 1 |
| 8 | Pooja | RISK_ANALYST | uk | Hybrid | 108 | 18AUG2019 | 01JAN2026 | 6 | 7 | 3 | 20 | 77 | 24197 | 1 |
| 9 | Rakesh | SQL_DEVELOPER | india | Fully_Remote | 109 | 01SEP2022 | 01JAN2026 | 8 | 11 | 6 | 27 | 40 | 24197 | 1 |
| 10 | Neha | DATA_SCIENTIST | usa | Fully_Remote | 110 | 15OCT2018 | 01JAN2026 | 10 | 14 | 8 | 38 | 87 | 24197 | 1 |
| 11 | Suresh | ETL_DEVELOPER | india | Hybrid | 111 | 01NOV2021 | 01JAN2026 | 9 | 12 | 7 | 29 | 50 | 24197 | 1 |
| 12 | Kiran | REPORT_ANALYST | india | Fully_Remote | 112 | 20DEC2020 | 01JAN2026 | 7 | 9 | 5 | 26 | 61 | 24197 | 1 |
| 13 | Meena | VALIDATION_LEAD | uk | Fully_Remote | 113 | 05JAN2019 | 01JAN2026 | 11 | 15 | 9 | 42 | 84 | 24197 | 1 |
| 14 | Prasad | MACRO_DEVELOPER | india | Hybrid | 114 | 14FEB2022 | 01JAN2026 | 10 | 14 | 8 | 33 | 47 | 24197 | 1 |
| 15 | Divya | SAFETY_ANALYST | india | Fully_Remote | 115 | 30MAR2021 | 01JAN2026 | 8 | 10 | 6 | 25 | 58 | 24197 | 1 |
·
HR analytics
·
Performance cycles
·
Payroll fraud windows
8. PRODUCTIVITY INDEX CALCULATION
data remote_productivity;
set remote_dates;
Productivity_Index = round((Tasks_Completed / Hours_Worked)*10,0.01);
run;
proc print data=remote_productivity;
run;
OUTPUT:
| Obs | Employee_Name | Role | Region | Work_Mode | Employee_ID | Join_Date | Report_Date | Hours_Worked | Tasks_Completed | Communication_Frequency | Burnout_Score | Tenure_Months | Next_Review_Date | Review_Month | Productivity_Index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Nithish | DATA_ANALYST | india | Fully_Remote | 101 | 15JAN2022 | 01JAN2026 | 9 | 12 | 6 | 32 | 48 | 24197 | 1 | 13.33 |
| 2 | Anjali | SAS_PROGRAMMER | india | Hybrid | 102 | 12FEB2021 | 01JAN2026 | 10 | 14 | 8 | 28 | 59 | 24197 | 1 | 14.00 |
| 3 | Rahul | STATISTICIAN | usa | Fully_Remote | 103 | 20MAR2020 | 01JAN2026 | 11 | 15 | 9 | 35 | 70 | 24197 | 1 | 13.64 |
| 4 | Sneha | DATA_MANAGER | uk | Fully_Remote | 104 | 01APR2019 | 01JAN2026 | 8 | 10 | 5 | 40 | 81 | 24197 | 1 | 12.50 |
| 5 | Arjun | BI_ANALYST | india | Hybrid | 105 | 17MAY2022 | 01JAN2026 | 7 | 9 | 4 | 22 | 44 | 24197 | 1 | 12.86 |
| 6 | Kavya | CLINICAL_ANALYST | india | Fully_Remote | 106 | 10JUN2021 | 01JAN2026 | 9 | 13 | 7 | 30 | 55 | 24197 | 1 | 14.44 |
| 7 | Amit | QA_ANALYST | usa | Fully_Remote | 107 | 05JUL2020 | 01JAN2026 | 12 | 16 | 10 | 45 | 66 | 24197 | 1 | 13.33 |
| 8 | Pooja | RISK_ANALYST | uk | Hybrid | 108 | 18AUG2019 | 01JAN2026 | 6 | 7 | 3 | 20 | 77 | 24197 | 1 | 11.67 |
| 9 | Rakesh | SQL_DEVELOPER | india | Fully_Remote | 109 | 01SEP2022 | 01JAN2026 | 8 | 11 | 6 | 27 | 40 | 24197 | 1 | 13.75 |
| 10 | Neha | DATA_SCIENTIST | usa | Fully_Remote | 110 | 15OCT2018 | 01JAN2026 | 10 | 14 | 8 | 38 | 87 | 24197 | 1 | 14.00 |
| 11 | Suresh | ETL_DEVELOPER | india | Hybrid | 111 | 01NOV2021 | 01JAN2026 | 9 | 12 | 7 | 29 | 50 | 24197 | 1 | 13.33 |
| 12 | Kiran | REPORT_ANALYST | india | Fully_Remote | 112 | 20DEC2020 | 01JAN2026 | 7 | 9 | 5 | 26 | 61 | 24197 | 1 | 12.86 |
| 13 | Meena | VALIDATION_LEAD | uk | Fully_Remote | 113 | 05JAN2019 | 01JAN2026 | 11 | 15 | 9 | 42 | 84 | 24197 | 1 | 13.64 |
| 14 | Prasad | MACRO_DEVELOPER | india | Hybrid | 114 | 14FEB2022 | 01JAN2026 | 10 | 14 | 8 | 33 | 47 | 24197 | 1 | 14.00 |
| 15 | Divya | SAFETY_ANALYST | india | Fully_Remote | 115 | 30MAR2021 | 01JAN2026 | 8 | 10 | 6 | 25 | 58 | 24197 | 1 | 12.50 |
·
Reusability
·
Parameterization
·
Enterprise scalability
9. UTILIZATION CLASSIFICATION (MACRO)
%macro utilization;
data remote_utilization;
set remote_productivity;
length Utilization_Class $8.;
if Productivity_Index >= 15 then Utilization_Class='High';
else if Productivity_Index >=10 then Utilization_Class='Medium';
else Utilization_Class='Low';
run;
proc print data=remote_utilization;
run;
%mend;
%utilization;
OUTPUT:
| Obs | Employee_Name | Role | Region | Work_Mode | Employee_ID | Join_Date | Report_Date | Hours_Worked | Tasks_Completed | Communication_Frequency | Burnout_Score | Tenure_Months | Next_Review_Date | Review_Month | Productivity_Index | Utilization_Class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Nithish | DATA_ANALYST | india | Fully_Remote | 101 | 15JAN2022 | 01JAN2026 | 9 | 12 | 6 | 32 | 48 | 24197 | 1 | 13.33 | Medium |
| 2 | Anjali | SAS_PROGRAMMER | india | Hybrid | 102 | 12FEB2021 | 01JAN2026 | 10 | 14 | 8 | 28 | 59 | 24197 | 1 | 14.00 | Medium |
| 3 | Rahul | STATISTICIAN | usa | Fully_Remote | 103 | 20MAR2020 | 01JAN2026 | 11 | 15 | 9 | 35 | 70 | 24197 | 1 | 13.64 | Medium |
| 4 | Sneha | DATA_MANAGER | uk | Fully_Remote | 104 | 01APR2019 | 01JAN2026 | 8 | 10 | 5 | 40 | 81 | 24197 | 1 | 12.50 | Medium |
| 5 | Arjun | BI_ANALYST | india | Hybrid | 105 | 17MAY2022 | 01JAN2026 | 7 | 9 | 4 | 22 | 44 | 24197 | 1 | 12.86 | Medium |
| 6 | Kavya | CLINICAL_ANALYST | india | Fully_Remote | 106 | 10JUN2021 | 01JAN2026 | 9 | 13 | 7 | 30 | 55 | 24197 | 1 | 14.44 | Medium |
| 7 | Amit | QA_ANALYST | usa | Fully_Remote | 107 | 05JUL2020 | 01JAN2026 | 12 | 16 | 10 | 45 | 66 | 24197 | 1 | 13.33 | Medium |
| 8 | Pooja | RISK_ANALYST | uk | Hybrid | 108 | 18AUG2019 | 01JAN2026 | 6 | 7 | 3 | 20 | 77 | 24197 | 1 | 11.67 | Medium |
| 9 | Rakesh | SQL_DEVELOPER | india | Fully_Remote | 109 | 01SEP2022 | 01JAN2026 | 8 | 11 | 6 | 27 | 40 | 24197 | 1 | 13.75 | Medium |
| 10 | Neha | DATA_SCIENTIST | usa | Fully_Remote | 110 | 15OCT2018 | 01JAN2026 | 10 | 14 | 8 | 38 | 87 | 24197 | 1 | 14.00 | Medium |
| 11 | Suresh | ETL_DEVELOPER | india | Hybrid | 111 | 01NOV2021 | 01JAN2026 | 9 | 12 | 7 | 29 | 50 | 24197 | 1 | 13.33 | Medium |
| 12 | Kiran | REPORT_ANALYST | india | Fully_Remote | 112 | 20DEC2020 | 01JAN2026 | 7 | 9 | 5 | 26 | 61 | 24197 | 1 | 12.86 | Medium |
| 13 | Meena | VALIDATION_LEAD | uk | Fully_Remote | 113 | 05JAN2019 | 01JAN2026 | 11 | 15 | 9 | 42 | 84 | 24197 | 1 | 13.64 | Medium |
| 14 | Prasad | MACRO_DEVELOPER | india | Hybrid | 114 | 14FEB2022 | 01JAN2026 | 10 | 14 | 8 | 33 | 47 | 24197 | 1 | 14.00 | Medium |
| 15 | Divya | SAFETY_ANALYST | india | Fully_Remote | 115 | 30MAR2021 | 01JAN2026 | 8 | 10 | 6 | 25 | 58 | 24197 | 1 | 12.50 | Medium |
10. FRAUD DETECTION LOGIC
%macro fraud;
data remote_fraud;
set remote_utilization;
if Hours_Worked > 11 and Tasks_Completed < 10 then Fraud_Flag='YES';
else Fraud_Flag='NO';
run;
proc print data=remote_fraud;
run;
%mend;
%fraud;
OUTPUT:
| Obs | Employee_Name | Role | Region | Work_Mode | Employee_ID | Join_Date | Report_Date | Hours_Worked | Tasks_Completed | Communication_Frequency | Burnout_Score | Tenure_Months | Next_Review_Date | Review_Month | Productivity_Index | Utilization_Class | Fraud_Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Nithish | DATA_ANALYST | india | Fully_Remote | 101 | 15JAN2022 | 01JAN2026 | 9 | 12 | 6 | 32 | 48 | 24197 | 1 | 13.33 | Medium | NO |
| 2 | Anjali | SAS_PROGRAMMER | india | Hybrid | 102 | 12FEB2021 | 01JAN2026 | 10 | 14 | 8 | 28 | 59 | 24197 | 1 | 14.00 | Medium | NO |
| 3 | Rahul | STATISTICIAN | usa | Fully_Remote | 103 | 20MAR2020 | 01JAN2026 | 11 | 15 | 9 | 35 | 70 | 24197 | 1 | 13.64 | Medium | NO |
| 4 | Sneha | DATA_MANAGER | uk | Fully_Remote | 104 | 01APR2019 | 01JAN2026 | 8 | 10 | 5 | 40 | 81 | 24197 | 1 | 12.50 | Medium | NO |
| 5 | Arjun | BI_ANALYST | india | Hybrid | 105 | 17MAY2022 | 01JAN2026 | 7 | 9 | 4 | 22 | 44 | 24197 | 1 | 12.86 | Medium | NO |
| 6 | Kavya | CLINICAL_ANALYST | india | Fully_Remote | 106 | 10JUN2021 | 01JAN2026 | 9 | 13 | 7 | 30 | 55 | 24197 | 1 | 14.44 | Medium | NO |
| 7 | Amit | QA_ANALYST | usa | Fully_Remote | 107 | 05JUL2020 | 01JAN2026 | 12 | 16 | 10 | 45 | 66 | 24197 | 1 | 13.33 | Medium | NO |
| 8 | Pooja | RISK_ANALYST | uk | Hybrid | 108 | 18AUG2019 | 01JAN2026 | 6 | 7 | 3 | 20 | 77 | 24197 | 1 | 11.67 | Medium | NO |
| 9 | Rakesh | SQL_DEVELOPER | india | Fully_Remote | 109 | 01SEP2022 | 01JAN2026 | 8 | 11 | 6 | 27 | 40 | 24197 | 1 | 13.75 | Medium | NO |
| 10 | Neha | DATA_SCIENTIST | usa | Fully_Remote | 110 | 15OCT2018 | 01JAN2026 | 10 | 14 | 8 | 38 | 87 | 24197 | 1 | 14.00 | Medium | NO |
| 11 | Suresh | ETL_DEVELOPER | india | Hybrid | 111 | 01NOV2021 | 01JAN2026 | 9 | 12 | 7 | 29 | 50 | 24197 | 1 | 13.33 | Medium | NO |
| 12 | Kiran | REPORT_ANALYST | india | Fully_Remote | 112 | 20DEC2020 | 01JAN2026 | 7 | 9 | 5 | 26 | 61 | 24197 | 1 | 12.86 | Medium | NO |
| 13 | Meena | VALIDATION_LEAD | uk | Fully_Remote | 113 | 05JAN2019 | 01JAN2026 | 11 | 15 | 9 | 42 | 84 | 24197 | 1 | 13.64 | Medium | NO |
| 14 | Prasad | MACRO_DEVELOPER | india | Hybrid | 114 | 14FEB2022 | 01JAN2026 | 10 | 14 | 8 | 33 | 47 | 24197 | 1 | 14.00 | Medium | NO |
| 15 | Divya | SAFETY_ANALYST | india | Fully_Remote | 115 | 30MAR2021 | 01JAN2026 | 8 | 10 | 6 | 25 | 58 | 24197 | 1 | 12.50 | Medium | NO |
·
High hours + low output
·
Burnout anomalies
·
Communication mismatch
11. PROC SQL ANALYTICS
proc sql;
create table role_summary as
select Role,
avg(Productivity_Index) as Avg_Productivity,
avg(Burnout_Score) as Avg_Burnout
from remote_fraud
group by Role;
quit;
proc print data=role_summary;
run;
OUTPUT:
| Obs | Role | Avg_Productivity | Avg_Burnout |
|---|---|---|---|
| 1 | BI_ANALYST | 12.86 | 22 |
| 2 | CLINICAL_ANALYST | 14.44 | 30 |
| 3 | DATA_ANALYST | 13.33 | 32 |
| 4 | DATA_MANAGER | 12.50 | 40 |
| 5 | DATA_SCIENTIST | 14.00 | 38 |
| 6 | ETL_DEVELOPER | 13.33 | 29 |
| 7 | MACRO_DEVELOPER | 14.00 | 33 |
| 8 | QA_ANALYST | 13.33 | 45 |
| 9 | REPORT_ANALYST | 12.86 | 26 |
| 10 | RISK_ANALYST | 11.67 | 20 |
| 11 | SAFETY_ANALYST | 12.50 | 25 |
| 12 | SAS_PROGRAMMER | 14.00 | 28 |
| 13 | SQL_DEVELOPER | 13.75 | 27 |
| 14 | STATISTICIAN | 13.64 | 35 |
| 15 | VALIDATION_LEAD | 13.64 | 42 |
12. STATISTICAL ANALYSIS
proc means data=remote_fraud mean min max std;
var Hours_Worked Tasks_Completed Productivity_Index Burnout_Score;
run;
OUTPUT:
The MEANS Procedure
| Variable | Mean | Minimum | Maximum | Std Dev |
|---|---|---|---|---|
Hours_Worked Tasks_Completed Productivity_Index Burnout_Score | 9.0000000 12.0666667 13.3233333 31.4666667 | 6.0000000 7.0000000 11.6700000 20.0000000 | 12.0000000 16.0000000 14.4400000 45.0000000 | 1.6903085 2.6583203 0.7329945 7.3568886 |
proc univariate data=remote_fraud;
var Productivity_Index;
run;
OUTPUT:
The UNIVARIATE Procedure
Variable: Productivity_Index
| Moments | |||
|---|---|---|---|
| N | 15 | Sum Weights | 15 |
| Mean | 13.3233333 | Sum Observations | 199.85 |
| Std Deviation | 0.73299451 | Variance | 0.53728095 |
| Skewness | -0.7055677 | Kurtosis | 0.31175916 |
| Uncorrected SS | 2670.1901 | Corrected SS | 7.52193333 |
| Coeff Variation | 5.50158502 | Std Error Mean | 0.18925837 |
| Basic Statistical Measures | |||
|---|---|---|---|
| Location | Variability | ||
| Mean | 13.32333 | Std Deviation | 0.73299 |
| Median | 13.33000 | Variance | 0.53728 |
| Mode | 13.33000 | Range | 2.77000 |
| Interquartile Range | 1.14000 | ||
Note: The mode displayed is the smallest of 2 modes with a count of 3.
| Tests for Location: Mu0=0 | ||||
|---|---|---|---|---|
| Test | Statistic | p Value | ||
| Student's t | t | 70.39759 | Pr > |t| | <.0001 |
| Sign | M | 7.5 | Pr >= |M| | <.0001 |
| Signed Rank | S | 60 | Pr >= |S| | <.0001 |
| Quantiles (Definition 5) | |
|---|---|
| Level | Quantile |
| 100% Max | 14.44 |
| 99% | 14.44 |
| 95% | 14.44 |
| 90% | 14.00 |
| 75% Q3 | 14.00 |
| 50% Median | 13.33 |
| 25% Q1 | 12.86 |
| 10% | 12.50 |
| 5% | 11.67 |
| 1% | 11.67 |
| 0% Min | 11.67 |
| Extreme Observations | |||
|---|---|---|---|
| Lowest | Highest | ||
| Value | Obs | Value | Obs |
| 11.67 | 8 | 13.75 | 9 |
| 12.50 | 15 | 14.00 | 2 |
| 12.50 | 4 | 14.00 | 10 |
| 12.86 | 12 | 14.00 | 14 |
| 12.86 | 5 | 14.44 | 6 |
13. FREQUENCY ANALYSIS
proc freq data=remote_fraud;
tables Utilization_Class Fraud_Flag Work_Mode;
run;
OUTPUT:
The FREQ Procedure
| Utilization_Class | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| Medium | 15 | 100.00 | 15 | 100.00 |
| Fraud_Flag | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| NO | 15 | 100.00 | 15 | 100.00 |
| Work_Mode | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| Fully_Remote | 10 | 66.67 | 10 | 66.67 |
| Hybrid | 5 | 33.33 | 15 | 100.00 |
14. CORRELATION ANALYSIS
proc corr data=remote_fraud;
var Hours_Worked Tasks_Completed Burnout_Score Productivity_Index;
run;
OUTPUT:
The CORR Procedure
| 4 Variables: | Hours_Worked Tasks_Completed Burnout_Score Productivity_Index |
|---|
| Simple Statistics | ||||||
|---|---|---|---|---|---|---|
| Variable | N | Mean | Std Dev | Sum | Minimum | Maximum |
| Hours_Worked | 15 | 9.00000 | 1.69031 | 135.00000 | 6.00000 | 12.00000 |
| Tasks_Completed | 15 | 12.06667 | 2.65832 | 181.00000 | 7.00000 | 16.00000 |
| Burnout_Score | 15 | 31.46667 | 7.35689 | 472.00000 | 20.00000 | 45.00000 |
| Productivity_Index | 15 | 13.32333 | 0.73299 | 199.85000 | 11.67000 | 14.44000 |
| Pearson Correlation Coefficients, N = 15 Prob > |r| under H0: Rho=0 | ||||
|---|---|---|---|---|
| Hours_Worked | Tasks_Completed | Burnout_Score | Productivity_Index | |
| Hours_Worked | 1.00000 | 0.98558 <.0001 | 0.80416 0.0003 | 0.65434 0.0081 |
| Tasks_Completed | 0.98558 <.0001 | 1.00000 | 0.75433 0.0012 | 0.77042 0.0008 |
| Burnout_Score | 0.80416 0.0003 | 0.75433 0.0012 | 1.00000 | 0.37455 0.1690 |
| Productivity_Index | 0.65434 0.0081 | 0.77042 0.0008 | 0.37455 0.1690 | 1.00000 |
15. VISUALIZATION
proc sgplot data=remote_fraud;
scatter x=Hours_Worked y=Productivity_Index;
run;
OUTPUT:
16. TRANSPOSE
proc transpose data=remote_fraud out=prod_t;
var Productivity_Index Burnout_Score;
by Employee_ID;
run;
proc print data=prod_t;
run;
OUTPUT:
| Obs | Employee_ID | _NAME_ | COL1 |
|---|---|---|---|
| 1 | 101 | Productivity_Index | 13.33 |
| 2 | 101 | Burnout_Score | 32.00 |
| 3 | 102 | Productivity_Index | 14.00 |
| 4 | 102 | Burnout_Score | 28.00 |
| 5 | 103 | Productivity_Index | 13.64 |
| 6 | 103 | Burnout_Score | 35.00 |
| 7 | 104 | Productivity_Index | 12.50 |
| 8 | 104 | Burnout_Score | 40.00 |
| 9 | 105 | Productivity_Index | 12.86 |
| 10 | 105 | Burnout_Score | 22.00 |
| 11 | 106 | Productivity_Index | 14.44 |
| 12 | 106 | Burnout_Score | 30.00 |
| 13 | 107 | Productivity_Index | 13.33 |
| 14 | 107 | Burnout_Score | 45.00 |
| 15 | 108 | Productivity_Index | 11.67 |
| 16 | 108 | Burnout_Score | 20.00 |
| 17 | 109 | Productivity_Index | 13.75 |
| 18 | 109 | Burnout_Score | 27.00 |
| 19 | 110 | Productivity_Index | 14.00 |
| 20 | 110 | Burnout_Score | 38.00 |
| 21 | 111 | Productivity_Index | 13.33 |
| 22 | 111 | Burnout_Score | 29.00 |
| 23 | 112 | Productivity_Index | 12.86 |
| 24 | 112 | Burnout_Score | 26.00 |
| 25 | 113 | Productivity_Index | 13.64 |
| 26 | 113 | Burnout_Score | 42.00 |
| 27 | 114 | Productivity_Index | 14.00 |
| 28 | 114 | Burnout_Score | 33.00 |
| 29 | 115 | Productivity_Index | 12.50 |
| 30 | 115 | Burnout_Score | 25.00 |
17. CREATE A NEW DATASET,APPEND
data remote_fraud_new;
set remote_fraud;
Report_Date = intnx('month', Report_Date, 1, 'same');
format Report_Date date9.;
run;
proc print data=remote_fraud_new;
run;
OUTPUT:
| Obs | Employee_Name | Role | Region | Work_Mode | Employee_ID | Join_Date | Report_Date | Hours_Worked | Tasks_Completed | Communication_Frequency | Burnout_Score | Tenure_Months | Next_Review_Date | Review_Month | Productivity_Index | Utilization_Class | Fraud_Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Nithish | DATA_ANALYST | india | Fully_Remote | 101 | 15JAN2022 | 01FEB2026 | 9 | 12 | 6 | 32 | 48 | 24197 | 1 | 13.33 | Medium | NO |
| 2 | Anjali | SAS_PROGRAMMER | india | Hybrid | 102 | 12FEB2021 | 01FEB2026 | 10 | 14 | 8 | 28 | 59 | 24197 | 1 | 14.00 | Medium | NO |
| 3 | Rahul | STATISTICIAN | usa | Fully_Remote | 103 | 20MAR2020 | 01FEB2026 | 11 | 15 | 9 | 35 | 70 | 24197 | 1 | 13.64 | Medium | NO |
| 4 | Sneha | DATA_MANAGER | uk | Fully_Remote | 104 | 01APR2019 | 01FEB2026 | 8 | 10 | 5 | 40 | 81 | 24197 | 1 | 12.50 | Medium | NO |
| 5 | Arjun | BI_ANALYST | india | Hybrid | 105 | 17MAY2022 | 01FEB2026 | 7 | 9 | 4 | 22 | 44 | 24197 | 1 | 12.86 | Medium | NO |
| 6 | Kavya | CLINICAL_ANALYST | india | Fully_Remote | 106 | 10JUN2021 | 01FEB2026 | 9 | 13 | 7 | 30 | 55 | 24197 | 1 | 14.44 | Medium | NO |
| 7 | Amit | QA_ANALYST | usa | Fully_Remote | 107 | 05JUL2020 | 01FEB2026 | 12 | 16 | 10 | 45 | 66 | 24197 | 1 | 13.33 | Medium | NO |
| 8 | Pooja | RISK_ANALYST | uk | Hybrid | 108 | 18AUG2019 | 01FEB2026 | 6 | 7 | 3 | 20 | 77 | 24197 | 1 | 11.67 | Medium | NO |
| 9 | Rakesh | SQL_DEVELOPER | india | Fully_Remote | 109 | 01SEP2022 | 01FEB2026 | 8 | 11 | 6 | 27 | 40 | 24197 | 1 | 13.75 | Medium | NO |
| 10 | Neha | DATA_SCIENTIST | usa | Fully_Remote | 110 | 15OCT2018 | 01FEB2026 | 10 | 14 | 8 | 38 | 87 | 24197 | 1 | 14.00 | Medium | NO |
| 11 | Suresh | ETL_DEVELOPER | india | Hybrid | 111 | 01NOV2021 | 01FEB2026 | 9 | 12 | 7 | 29 | 50 | 24197 | 1 | 13.33 | Medium | NO |
| 12 | Kiran | REPORT_ANALYST | india | Fully_Remote | 112 | 20DEC2020 | 01FEB2026 | 7 | 9 | 5 | 26 | 61 | 24197 | 1 | 12.86 | Medium | NO |
| 13 | Meena | VALIDATION_LEAD | uk | Fully_Remote | 113 | 05JAN2019 | 01FEB2026 | 11 | 15 | 9 | 42 | 84 | 24197 | 1 | 13.64 | Medium | NO |
| 14 | Prasad | MACRO_DEVELOPER | india | Hybrid | 114 | 14FEB2022 | 01FEB2026 | 10 | 14 | 8 | 33 | 47 | 24197 | 1 | 14.00 | Medium | NO |
| 15 | Divya | SAFETY_ANALYST | india | Fully_Remote | 115 | 30MAR2021 | 01FEB2026 | 8 | 10 | 6 | 25 | 58 | 24197 | 1 | 12.50 | Medium | NO |
proc append base=remote_fraud
data=remote_fraud_new force;
run;
proc print data=remote_fraud;
run;
OUTPUT:
| Obs | Employee_Name | Role | Region | Work_Mode | Employee_ID | Join_Date | Report_Date | Hours_Worked | Tasks_Completed | Communication_Frequency | Burnout_Score | Tenure_Months | Next_Review_Date | Review_Month | Productivity_Index | Utilization_Class | Fraud_Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Nithish | DATA_ANALYST | india | Fully_Remote | 101 | 15JAN2022 | 01JAN2026 | 9 | 12 | 6 | 32 | 48 | 24197 | 1 | 13.33 | Medium | NO |
| 2 | Anjali | SAS_PROGRAMMER | india | Hybrid | 102 | 12FEB2021 | 01JAN2026 | 10 | 14 | 8 | 28 | 59 | 24197 | 1 | 14.00 | Medium | NO |
| 3 | Rahul | STATISTICIAN | usa | Fully_Remote | 103 | 20MAR2020 | 01JAN2026 | 11 | 15 | 9 | 35 | 70 | 24197 | 1 | 13.64 | Medium | NO |
| 4 | Sneha | DATA_MANAGER | uk | Fully_Remote | 104 | 01APR2019 | 01JAN2026 | 8 | 10 | 5 | 40 | 81 | 24197 | 1 | 12.50 | Medium | NO |
| 5 | Arjun | BI_ANALYST | india | Hybrid | 105 | 17MAY2022 | 01JAN2026 | 7 | 9 | 4 | 22 | 44 | 24197 | 1 | 12.86 | Medium | NO |
| 6 | Kavya | CLINICAL_ANALYST | india | Fully_Remote | 106 | 10JUN2021 | 01JAN2026 | 9 | 13 | 7 | 30 | 55 | 24197 | 1 | 14.44 | Medium | NO |
| 7 | Amit | QA_ANALYST | usa | Fully_Remote | 107 | 05JUL2020 | 01JAN2026 | 12 | 16 | 10 | 45 | 66 | 24197 | 1 | 13.33 | Medium | NO |
| 8 | Pooja | RISK_ANALYST | uk | Hybrid | 108 | 18AUG2019 | 01JAN2026 | 6 | 7 | 3 | 20 | 77 | 24197 | 1 | 11.67 | Medium | NO |
| 9 | Rakesh | SQL_DEVELOPER | india | Fully_Remote | 109 | 01SEP2022 | 01JAN2026 | 8 | 11 | 6 | 27 | 40 | 24197 | 1 | 13.75 | Medium | NO |
| 10 | Neha | DATA_SCIENTIST | usa | Fully_Remote | 110 | 15OCT2018 | 01JAN2026 | 10 | 14 | 8 | 38 | 87 | 24197 | 1 | 14.00 | Medium | NO |
| 11 | Suresh | ETL_DEVELOPER | india | Hybrid | 111 | 01NOV2021 | 01JAN2026 | 9 | 12 | 7 | 29 | 50 | 24197 | 1 | 13.33 | Medium | NO |
| 12 | Kiran | REPORT_ANALYST | india | Fully_Remote | 112 | 20DEC2020 | 01JAN2026 | 7 | 9 | 5 | 26 | 61 | 24197 | 1 | 12.86 | Medium | NO |
| 13 | Meena | VALIDATION_LEAD | uk | Fully_Remote | 113 | 05JAN2019 | 01JAN2026 | 11 | 15 | 9 | 42 | 84 | 24197 | 1 | 13.64 | Medium | NO |
| 14 | Prasad | MACRO_DEVELOPER | india | Hybrid | 114 | 14FEB2022 | 01JAN2026 | 10 | 14 | 8 | 33 | 47 | 24197 | 1 | 14.00 | Medium | NO |
| 15 | Divya | SAFETY_ANALYST | india | Fully_Remote | 115 | 30MAR2021 | 01JAN2026 | 8 | 10 | 6 | 25 | 58 | 24197 | 1 | 12.50 | Medium | NO |
| 16 | Nithish | DATA_ANALYST | india | Fully_Remote | 101 | 15JAN2022 | 01FEB2026 | 9 | 12 | 6 | 32 | 48 | 24197 | 1 | 13.33 | Medium | NO |
| 17 | Anjali | SAS_PROGRAMMER | india | Hybrid | 102 | 12FEB2021 | 01FEB2026 | 10 | 14 | 8 | 28 | 59 | 24197 | 1 | 14.00 | Medium | NO |
| 18 | Rahul | STATISTICIAN | usa | Fully_Remote | 103 | 20MAR2020 | 01FEB2026 | 11 | 15 | 9 | 35 | 70 | 24197 | 1 | 13.64 | Medium | NO |
| 19 | Sneha | DATA_MANAGER | uk | Fully_Remote | 104 | 01APR2019 | 01FEB2026 | 8 | 10 | 5 | 40 | 81 | 24197 | 1 | 12.50 | Medium | NO |
| 20 | Arjun | BI_ANALYST | india | Hybrid | 105 | 17MAY2022 | 01FEB2026 | 7 | 9 | 4 | 22 | 44 | 24197 | 1 | 12.86 | Medium | NO |
| 21 | Kavya | CLINICAL_ANALYST | india | Fully_Remote | 106 | 10JUN2021 | 01FEB2026 | 9 | 13 | 7 | 30 | 55 | 24197 | 1 | 14.44 | Medium | NO |
| 22 | Amit | QA_ANALYST | usa | Fully_Remote | 107 | 05JUL2020 | 01FEB2026 | 12 | 16 | 10 | 45 | 66 | 24197 | 1 | 13.33 | Medium | NO |
| 23 | Pooja | RISK_ANALYST | uk | Hybrid | 108 | 18AUG2019 | 01FEB2026 | 6 | 7 | 3 | 20 | 77 | 24197 | 1 | 11.67 | Medium | NO |
| 24 | Rakesh | SQL_DEVELOPER | india | Fully_Remote | 109 | 01SEP2022 | 01FEB2026 | 8 | 11 | 6 | 27 | 40 | 24197 | 1 | 13.75 | Medium | NO |
| 25 | Neha | DATA_SCIENTIST | usa | Fully_Remote | 110 | 15OCT2018 | 01FEB2026 | 10 | 14 | 8 | 38 | 87 | 24197 | 1 | 14.00 | Medium | NO |
| 26 | Suresh | ETL_DEVELOPER | india | Hybrid | 111 | 01NOV2021 | 01FEB2026 | 9 | 12 | 7 | 29 | 50 | 24197 | 1 | 13.33 | Medium | NO |
| 27 | Kiran | REPORT_ANALYST | india | Fully_Remote | 112 | 20DEC2020 | 01FEB2026 | 7 | 9 | 5 | 26 | 61 | 24197 | 1 | 12.86 | Medium | NO |
| 28 | Meena | VALIDATION_LEAD | uk | Fully_Remote | 113 | 05JAN2019 | 01FEB2026 | 11 | 15 | 9 | 42 | 84 | 24197 | 1 | 13.64 | Medium | NO |
| 29 | Prasad | MACRO_DEVELOPER | india | Hybrid | 114 | 14FEB2022 | 01FEB2026 | 10 | 14 | 8 | 33 | 47 | 24197 | 1 | 14.00 | Medium | NO |
| 30 | Divya | SAFETY_ANALYST | india | Fully_Remote | 115 | 30MAR2021 | 01FEB2026 | 8 | 10 | 6 | 25 | 58 | 24197 | 1 | 12.50 | Medium | NO |
run;
OUTPUT:
| Obs | Employee_Name | Role | Region | Work_Mode | Employee_ID | Join_Date | Report_Date | Hours_Worked | Tasks_Completed | Communication_Frequency | Burnout_Score | Tenure_Months | Next_Review_Date | Review_Month | Productivity_Index | Utilization_Class | Fraud_Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Arjun | BI_ANALYST | india | Hybrid | 105 | 17MAY2022 | 01JAN2026 | 7 | 9 | 4 | 22 | 44 | 24197 | 1 | 12.86 | Medium | NO |
| 2 | Arjun | BI_ANALYST | india | Hybrid | 105 | 17MAY2022 | 01FEB2026 | 7 | 9 | 4 | 22 | 44 | 24197 | 1 | 12.86 | Medium | NO |
| 3 | Kavya | CLINICAL_ANALYST | india | Fully_Remote | 106 | 10JUN2021 | 01JAN2026 | 9 | 13 | 7 | 30 | 55 | 24197 | 1 | 14.44 | Medium | NO |
| 4 | Kavya | CLINICAL_ANALYST | india | Fully_Remote | 106 | 10JUN2021 | 01FEB2026 | 9 | 13 | 7 | 30 | 55 | 24197 | 1 | 14.44 | Medium | NO |
| 5 | Nithish | DATA_ANALYST | india | Fully_Remote | 101 | 15JAN2022 | 01JAN2026 | 9 | 12 | 6 | 32 | 48 | 24197 | 1 | 13.33 | Medium | NO |
| 6 | Nithish | DATA_ANALYST | india | Fully_Remote | 101 | 15JAN2022 | 01FEB2026 | 9 | 12 | 6 | 32 | 48 | 24197 | 1 | 13.33 | Medium | NO |
| 7 | Sneha | DATA_MANAGER | uk | Fully_Remote | 104 | 01APR2019 | 01JAN2026 | 8 | 10 | 5 | 40 | 81 | 24197 | 1 | 12.50 | Medium | NO |
| 8 | Sneha | DATA_MANAGER | uk | Fully_Remote | 104 | 01APR2019 | 01FEB2026 | 8 | 10 | 5 | 40 | 81 | 24197 | 1 | 12.50 | Medium | NO |
| 9 | Neha | DATA_SCIENTIST | usa | Fully_Remote | 110 | 15OCT2018 | 01JAN2026 | 10 | 14 | 8 | 38 | 87 | 24197 | 1 | 14.00 | Medium | NO |
| 10 | Neha | DATA_SCIENTIST | usa | Fully_Remote | 110 | 15OCT2018 | 01FEB2026 | 10 | 14 | 8 | 38 | 87 | 24197 | 1 | 14.00 | Medium | NO |
| 11 | Suresh | ETL_DEVELOPER | india | Hybrid | 111 | 01NOV2021 | 01JAN2026 | 9 | 12 | 7 | 29 | 50 | 24197 | 1 | 13.33 | Medium | NO |
| 12 | Suresh | ETL_DEVELOPER | india | Hybrid | 111 | 01NOV2021 | 01FEB2026 | 9 | 12 | 7 | 29 | 50 | 24197 | 1 | 13.33 | Medium | NO |
| 13 | Prasad | MACRO_DEVELOPER | india | Hybrid | 114 | 14FEB2022 | 01JAN2026 | 10 | 14 | 8 | 33 | 47 | 24197 | 1 | 14.00 | Medium | NO |
| 14 | Prasad | MACRO_DEVELOPER | india | Hybrid | 114 | 14FEB2022 | 01FEB2026 | 10 | 14 | 8 | 33 | 47 | 24197 | 1 | 14.00 | Medium | NO |
| 15 | Amit | QA_ANALYST | usa | Fully_Remote | 107 | 05JUL2020 | 01JAN2026 | 12 | 16 | 10 | 45 | 66 | 24197 | 1 | 13.33 | Medium | NO |
| 16 | Amit | QA_ANALYST | usa | Fully_Remote | 107 | 05JUL2020 | 01FEB2026 | 12 | 16 | 10 | 45 | 66 | 24197 | 1 | 13.33 | Medium | NO |
| 17 | Kiran | REPORT_ANALYST | india | Fully_Remote | 112 | 20DEC2020 | 01JAN2026 | 7 | 9 | 5 | 26 | 61 | 24197 | 1 | 12.86 | Medium | NO |
| 18 | Kiran | REPORT_ANALYST | india | Fully_Remote | 112 | 20DEC2020 | 01FEB2026 | 7 | 9 | 5 | 26 | 61 | 24197 | 1 | 12.86 | Medium | NO |
| 19 | Pooja | RISK_ANALYST | uk | Hybrid | 108 | 18AUG2019 | 01JAN2026 | 6 | 7 | 3 | 20 | 77 | 24197 | 1 | 11.67 | Medium | NO |
| 20 | Pooja | RISK_ANALYST | uk | Hybrid | 108 | 18AUG2019 | 01FEB2026 | 6 | 7 | 3 | 20 | 77 | 24197 | 1 | 11.67 | Medium | NO |
| 21 | Divya | SAFETY_ANALYST | india | Fully_Remote | 115 | 30MAR2021 | 01JAN2026 | 8 | 10 | 6 | 25 | 58 | 24197 | 1 | 12.50 | Medium | NO |
| 22 | Divya | SAFETY_ANALYST | india | Fully_Remote | 115 | 30MAR2021 | 01FEB2026 | 8 | 10 | 6 | 25 | 58 | 24197 | 1 | 12.50 | Medium | NO |
| 23 | Anjali | SAS_PROGRAMMER | india | Hybrid | 102 | 12FEB2021 | 01JAN2026 | 10 | 14 | 8 | 28 | 59 | 24197 | 1 | 14.00 | Medium | NO |
| 24 | Anjali | SAS_PROGRAMMER | india | Hybrid | 102 | 12FEB2021 | 01FEB2026 | 10 | 14 | 8 | 28 | 59 | 24197 | 1 | 14.00 | Medium | NO |
| 25 | Rakesh | SQL_DEVELOPER | india | Fully_Remote | 109 | 01SEP2022 | 01JAN2026 | 8 | 11 | 6 | 27 | 40 | 24197 | 1 | 13.75 | Medium | NO |
| 26 | Rakesh | SQL_DEVELOPER | india | Fully_Remote | 109 | 01SEP2022 | 01FEB2026 | 8 | 11 | 6 | 27 | 40 | 24197 | 1 | 13.75 | Medium | NO |
| 27 | Rahul | STATISTICIAN | usa | Fully_Remote | 103 | 20MAR2020 | 01JAN2026 | 11 | 15 | 9 | 35 | 70 | 24197 | 1 | 13.64 | Medium | NO |
| 28 | Rahul | STATISTICIAN | usa | Fully_Remote | 103 | 20MAR2020 | 01FEB2026 | 11 | 15 | 9 | 35 | 70 | 24197 | 1 | 13.64 | Medium | NO |
| 29 | Meena | VALIDATION_LEAD | uk | Fully_Remote | 113 | 05JAN2019 | 01JAN2026 | 11 | 15 | 9 | 42 | 84 | 24197 | 1 | 13.64 | Medium | NO |
| 30 | Meena | VALIDATION_LEAD | uk | Fully_Remote | 113 | 05JAN2019 | 01FEB2026 | 11 | 15 | 9 | 42 | 84 | 24197 | 1 | 13.64 | Medium | NO |
run;
OUTPUT:
| Obs | Role | Avg_Productivity | Avg_Burnout |
|---|---|---|---|
| 1 | BI_ANALYST | 12.86 | 22 |
| 2 | CLINICAL_ANALYST | 14.44 | 30 |
| 3 | DATA_ANALYST | 13.33 | 32 |
| 4 | DATA_MANAGER | 12.50 | 40 |
| 5 | DATA_SCIENTIST | 14.00 | 38 |
| 6 | ETL_DEVELOPER | 13.33 | 29 |
| 7 | MACRO_DEVELOPER | 14.00 | 33 |
| 8 | QA_ANALYST | 13.33 | 45 |
| 9 | REPORT_ANALYST | 12.86 | 26 |
| 10 | RISK_ANALYST | 11.67 | 20 |
| 11 | SAFETY_ANALYST | 12.50 | 25 |
| 12 | SAS_PROGRAMMER | 14.00 | 28 |
| 13 | SQL_DEVELOPER | 13.75 | 27 |
| 14 | STATISTICIAN | 13.64 | 35 |
| 15 | VALIDATION_LEAD | 13.64 | 42 |
data merged;
merge remote_fraud
role_summary;
by Role;
run;
proc print data=merged;
run;
OUTPUT:
| Obs | Employee_Name | Role | Region | Work_Mode | Employee_ID | Join_Date | Report_Date | Hours_Worked | Tasks_Completed | Communication_Frequency | Burnout_Score | Tenure_Months | Next_Review_Date | Review_Month | Productivity_Index | Utilization_Class | Fraud_Flag | Avg_Productivity | Avg_Burnout |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Arjun | BI_ANALYST | india | Hybrid | 105 | 17MAY2022 | 01JAN2026 | 7 | 9 | 4 | 22 | 44 | 24197 | 1 | 12.86 | Medium | NO | 12.86 | 22 |
| 2 | Arjun | BI_ANALYST | india | Hybrid | 105 | 17MAY2022 | 01FEB2026 | 7 | 9 | 4 | 22 | 44 | 24197 | 1 | 12.86 | Medium | NO | 12.86 | 22 |
| 3 | Kavya | CLINICAL_ANALYST | india | Fully_Remote | 106 | 10JUN2021 | 01JAN2026 | 9 | 13 | 7 | 30 | 55 | 24197 | 1 | 14.44 | Medium | NO | 14.44 | 30 |
| 4 | Kavya | CLINICAL_ANALYST | india | Fully_Remote | 106 | 10JUN2021 | 01FEB2026 | 9 | 13 | 7 | 30 | 55 | 24197 | 1 | 14.44 | Medium | NO | 14.44 | 30 |
| 5 | Nithish | DATA_ANALYST | india | Fully_Remote | 101 | 15JAN2022 | 01JAN2026 | 9 | 12 | 6 | 32 | 48 | 24197 | 1 | 13.33 | Medium | NO | 13.33 | 32 |
| 6 | Nithish | DATA_ANALYST | india | Fully_Remote | 101 | 15JAN2022 | 01FEB2026 | 9 | 12 | 6 | 32 | 48 | 24197 | 1 | 13.33 | Medium | NO | 13.33 | 32 |
| 7 | Sneha | DATA_MANAGER | uk | Fully_Remote | 104 | 01APR2019 | 01JAN2026 | 8 | 10 | 5 | 40 | 81 | 24197 | 1 | 12.50 | Medium | NO | 12.50 | 40 |
| 8 | Sneha | DATA_MANAGER | uk | Fully_Remote | 104 | 01APR2019 | 01FEB2026 | 8 | 10 | 5 | 40 | 81 | 24197 | 1 | 12.50 | Medium | NO | 12.50 | 40 |
| 9 | Neha | DATA_SCIENTIST | usa | Fully_Remote | 110 | 15OCT2018 | 01JAN2026 | 10 | 14 | 8 | 38 | 87 | 24197 | 1 | 14.00 | Medium | NO | 14.00 | 38 |
| 10 | Neha | DATA_SCIENTIST | usa | Fully_Remote | 110 | 15OCT2018 | 01FEB2026 | 10 | 14 | 8 | 38 | 87 | 24197 | 1 | 14.00 | Medium | NO | 14.00 | 38 |
| 11 | Suresh | ETL_DEVELOPER | india | Hybrid | 111 | 01NOV2021 | 01JAN2026 | 9 | 12 | 7 | 29 | 50 | 24197 | 1 | 13.33 | Medium | NO | 13.33 | 29 |
| 12 | Suresh | ETL_DEVELOPER | india | Hybrid | 111 | 01NOV2021 | 01FEB2026 | 9 | 12 | 7 | 29 | 50 | 24197 | 1 | 13.33 | Medium | NO | 13.33 | 29 |
| 13 | Prasad | MACRO_DEVELOPER | india | Hybrid | 114 | 14FEB2022 | 01JAN2026 | 10 | 14 | 8 | 33 | 47 | 24197 | 1 | 14.00 | Medium | NO | 14.00 | 33 |
| 14 | Prasad | MACRO_DEVELOPER | india | Hybrid | 114 | 14FEB2022 | 01FEB2026 | 10 | 14 | 8 | 33 | 47 | 24197 | 1 | 14.00 | Medium | NO | 14.00 | 33 |
| 15 | Amit | QA_ANALYST | usa | Fully_Remote | 107 | 05JUL2020 | 01JAN2026 | 12 | 16 | 10 | 45 | 66 | 24197 | 1 | 13.33 | Medium | NO | 13.33 | 45 |
| 16 | Amit | QA_ANALYST | usa | Fully_Remote | 107 | 05JUL2020 | 01FEB2026 | 12 | 16 | 10 | 45 | 66 | 24197 | 1 | 13.33 | Medium | NO | 13.33 | 45 |
| 17 | Kiran | REPORT_ANALYST | india | Fully_Remote | 112 | 20DEC2020 | 01JAN2026 | 7 | 9 | 5 | 26 | 61 | 24197 | 1 | 12.86 | Medium | NO | 12.86 | 26 |
| 18 | Kiran | REPORT_ANALYST | india | Fully_Remote | 112 | 20DEC2020 | 01FEB2026 | 7 | 9 | 5 | 26 | 61 | 24197 | 1 | 12.86 | Medium | NO | 12.86 | 26 |
| 19 | Pooja | RISK_ANALYST | uk | Hybrid | 108 | 18AUG2019 | 01JAN2026 | 6 | 7 | 3 | 20 | 77 | 24197 | 1 | 11.67 | Medium | NO | 11.67 | 20 |
| 20 | Pooja | RISK_ANALYST | uk | Hybrid | 108 | 18AUG2019 | 01FEB2026 | 6 | 7 | 3 | 20 | 77 | 24197 | 1 | 11.67 | Medium | NO | 11.67 | 20 |
| 21 | Divya | SAFETY_ANALYST | india | Fully_Remote | 115 | 30MAR2021 | 01JAN2026 | 8 | 10 | 6 | 25 | 58 | 24197 | 1 | 12.50 | Medium | NO | 12.50 | 25 |
| 22 | Divya | SAFETY_ANALYST | india | Fully_Remote | 115 | 30MAR2021 | 01FEB2026 | 8 | 10 | 6 | 25 | 58 | 24197 | 1 | 12.50 | Medium | NO | 12.50 | 25 |
| 23 | Anjali | SAS_PROGRAMMER | india | Hybrid | 102 | 12FEB2021 | 01JAN2026 | 10 | 14 | 8 | 28 | 59 | 24197 | 1 | 14.00 | Medium | NO | 14.00 | 28 |
| 24 | Anjali | SAS_PROGRAMMER | india | Hybrid | 102 | 12FEB2021 | 01FEB2026 | 10 | 14 | 8 | 28 | 59 | 24197 | 1 | 14.00 | Medium | NO | 14.00 | 28 |
| 25 | Rakesh | SQL_DEVELOPER | india | Fully_Remote | 109 | 01SEP2022 | 01JAN2026 | 8 | 11 | 6 | 27 | 40 | 24197 | 1 | 13.75 | Medium | NO | 13.75 | 27 |
| 26 | Rakesh | SQL_DEVELOPER | india | Fully_Remote | 109 | 01SEP2022 | 01FEB2026 | 8 | 11 | 6 | 27 | 40 | 24197 | 1 | 13.75 | Medium | NO | 13.75 | 27 |
| 27 | Rahul | STATISTICIAN | usa | Fully_Remote | 103 | 20MAR2020 | 01JAN2026 | 11 | 15 | 9 | 35 | 70 | 24197 | 1 | 13.64 | Medium | NO | 13.64 | 35 |
| 28 | Rahul | STATISTICIAN | usa | Fully_Remote | 103 | 20MAR2020 | 01FEB2026 | 11 | 15 | 9 | 35 | 70 | 24197 | 1 | 13.64 | Medium | NO | 13.64 | 35 |
| 29 | Meena | VALIDATION_LEAD | uk | Fully_Remote | 113 | 05JAN2019 | 01JAN2026 | 11 | 15 | 9 | 42 | 84 | 24197 | 1 | 13.64 | Medium | NO | 13.64 | 42 |
| 30 | Meena | VALIDATION_LEAD | uk | Fully_Remote | 113 | 05JAN2019 | 01FEB2026 | 11 | 15 | 9 | 42 | 84 | 24197 | 1 | 13.64 | Medium | NO | 13.64 | 42 |
18. PROC DATASETS CLEANUP
proc datasets library=work nolist;
delete prod_t;
quit;
LOG:
18. BUSINESS INSIGHTS
1. High
productivity ≠ long hours
2. Burnout strongly correlates with
communication overload
3. Fraud patterns visible through hour-task
mismatch
4. Utilization macro simplifies HR
reporting
19. INTERVIEW TALKING POINTS
· Why
use INTCK
vs INTNX
·
Macro vs Data Step logic
·
Fraud detection logic explanation
·
SQL vs PROC MEANS
·
Visualization choice
20. CONCLUSION
This project demonstrates real corporate SAS analytics, combining:
·
HR analytics
·
Productivity modeling
·
Fraud detection
·
Macro automation
·
Statistical validation
This project demonstrates a
complete, real-world SAS analytics workflow for managing and monitoring remote
employees. Starting from raw data creation, it applies data cleaning, date
intelligence, character and numeric functions, macros, SQL, and statistical
procedures to derive meaningful insights. Productivity and utilization are
quantified using business-driven logic, while fraud detection rules highlight
potential misuse of work hours. Procedures like PROC MEANS, PROC UNIVARIATE,
PROC FREQ, PROC CORR, and PROC SGPLOT support evidence-based decision-making.
Overall, the project reflects how SAS is effectively used in corporate
environments for performance tracking, risk identification, and scalable,
automated reporting.
INTERVIEW QUESTIONS FOR YOU
1.What is the difference between SET, MERGE, and PROC APPEND in SAS?
2.When would you prefer PROC SQL over DATA step in SAS?
3.What is the difference between INTCK and INTNX functions?
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
1.What is the difference between SET, MERGE, and PROC APPEND in SAS?
2.When would you prefer PROC SQL over DATA step in SAS?
3.What is the difference between INTCK and INTNX functions?
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.
Disclaimer:
The datasets and analysis in this article are created for educational and demonstration purposes only. They do not represent REMOTE WORK data.
Our Mission:
This blog provides industry-focused SAS programming tutorials and analytics projects covering finance, healthcare, and technology.
This project is suitable for:
· Students learning SAS
· Data analysts building portfolios
· Professionals preparing for SAS interviews
· Bloggers writing about analytics and smart cities
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Comments
Post a Comment