386.Can SAS Really Predict Microfinance Loan Risk and Fraud Before It Happens?
Can SAS Really Predict Microfinance Loan Risk and Fraud Before It Happens?
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HERE IN THIS PROJECT WE USED THESE SAS STATEMENTS —
DATA STEP | PROC SQL | PROC PRINT | PROC SGPLOT | MACROS | PROC CORR | PROC MEANS | PROC FREQ | PROC UNIVARIATE | APPEND | PROC SORT | MERGE | PROC DATASETS DELETE | DATA FUNCTIONS
Table Of Contents
- Introduction
- Business Context
- Dataset Creation (15+ observations with dates)
- Data Manipulation (SET, MERGE, APPEND, TRANSPOSE)
- Character & Numeric Functions
- Utilization Classification Macro
- Fraud Detection Macro
- PROC SQL Analytics
- PROC FREQ
- PROC MEANS
- PROC UNIVARIATE
- PROC CORR
- PROC SGPLOT
- PROC DATASETS (Delete)
1. INTRODUCTION
Microfinance plays a critical role in financial inclusion, especially in countries like India where millions of people lack access to formal banking systems. Microfinance institutions (MFIs) provide small loans to low-income individuals for:
- Agriculture
- Small businesses
- Education
- Livelihood activities
However, MFIs face serious challenges:
- Loan defaults
- Fraudulent borrowers
- High-risk regions
- Poor repayment discipline
This project simulates a real microfinance loan system and demonstrates how SAS analytics can be used to:
- Identify high-risk customers
- Detect fraud
- Analyze loan behavior
- Support business decisions
2. BUSINESS CONTEXT
Imagine you are a SAS Programmer at a Microfinance Company.
Your management asks:
"We have thousands of loans. Which ones are risky? Which regions default more? Are we giving loans to fraud customers?"
You are given loan data and asked to build:
- Risk classification logic
- Fraud detection model
- Analytical dashboards
- Statistical insights
This project is exactly that.
3. DATASET CREATION
data microfinance_loans;
input Loan_ID $ Region $ Loan_Amount Interest_Rate Repayment_Duration Default_Flag
Loan_Date:date9.;
format Loan_Date date9.;
datalines;
L001 South 50000 12.5 24 0 01JAN2023
L002 North 75000 14.2 36 1 15FEB2023
L003 East 30000 10.8 12 0 10MAR2023
L004 West 120000 16.5 48 1 20APR2023
L005 South 45000 11.2 18 0 05MAY2023
L006 North 95000 17.8 60 1 11JUN2023
L007 East 38000 9.5 12 0 02JUL2023
L008 West 200000 19.2 72 1 19AUG2023
L009 South 60000 13.5 24 0 10SEP2023
L010 North 82000 15.1 36 1 21OCT2023
L011 East 28000 10.0 12 0 11NOV2023
L012 West 150000 18.4 60 1 15DEC2023
L013 South 55000 12.0 24 0 05JAN2024
L014 North 90000 16.0 48 1 22FEB2024
L015 East 35000 11.0 18 0 10MAR2024
L016 West 180000 20.0 72 1 01APR2024
;
run;
proc print data=microfinance_loans;
run;
OUTPUT:
| Obs | Loan_ID | Region | Loan_Amount | Interest_Rate | Repayment_Duration | Default_Flag | Loan_Date |
|---|---|---|---|---|---|---|---|
| 1 | L001 | South | 50000 | 12.5 | 24 | 0 | 01JAN2023 |
| 2 | L002 | North | 75000 | 14.2 | 36 | 1 | 15FEB2023 |
| 3 | L003 | East | 30000 | 10.8 | 12 | 0 | 10MAR2023 |
| 4 | L004 | West | 120000 | 16.5 | 48 | 1 | 20APR2023 |
| 5 | L005 | South | 45000 | 11.2 | 18 | 0 | 05MAY2023 |
| 6 | L006 | North | 95000 | 17.8 | 60 | 1 | 11JUN2023 |
| 7 | L007 | East | 38000 | 9.5 | 12 | 0 | 02JUL2023 |
| 8 | L008 | West | 200000 | 19.2 | 72 | 1 | 19AUG2023 |
| 9 | L009 | South | 60000 | 13.5 | 24 | 0 | 10SEP2023 |
| 10 | L010 | North | 82000 | 15.1 | 36 | 1 | 21OCT2023 |
| 11 | L011 | East | 28000 | 10.0 | 12 | 0 | 11NOV2023 |
| 12 | L012 | West | 150000 | 18.4 | 60 | 1 | 15DEC2023 |
| 13 | L013 | South | 55000 | 12.0 | 24 | 0 | 05JAN2024 |
| 14 | L014 | North | 90000 | 16.0 | 48 | 1 | 22FEB2024 |
| 15 | L015 | East | 35000 | 11.0 | 18 | 0 | 10MAR2024 |
| 16 | L016 | West | 180000 | 20.0 | 72 | 1 | 01APR2024 |
Used for row-level transformations.
Used to read dataset sequentially.
4. CHARACTER & NUMERIC FUNCTIONS
data loans_clean;
set microfinance_loans;
Region_Proper = propcase(strip(Region));
Loan_Category = catx('-', Region_Proper, Loan_ID);
Risk_Score = Loan_Amount * Interest_Rate / Repayment_Duration;
Loan_Amount_Filled = coalesce(Loan_Amount, 0);
Region_Upper = upcase(Region);
Region_Lower = lowcase(Region);
run;
proc print data=loans_clean;
var Loan_ID Region Region_Proper Loan_Category Risk_Score Loan_Amount_Filled
Region_Upper Region_Lower;
run;
OUTPUT:
| Obs | Loan_ID | Region | Region_Proper | Loan_Category | Risk_Score | Loan_Amount_Filled | Region_Upper | Region_Lower |
|---|---|---|---|---|---|---|---|---|
| 1 | L001 | South | South | South-L001 | 26041.67 | 50000 | SOUTH | south |
| 2 | L002 | North | North | North-L002 | 29583.33 | 75000 | NORTH | north |
| 3 | L003 | East | East | East-L003 | 27000.00 | 30000 | EAST | east |
| 4 | L004 | West | West | West-L004 | 41250.00 | 120000 | WEST | west |
| 5 | L005 | South | South | South-L005 | 28000.00 | 45000 | SOUTH | south |
| 6 | L006 | North | North | North-L006 | 28183.33 | 95000 | NORTH | north |
| 7 | L007 | East | East | East-L007 | 30083.33 | 38000 | EAST | east |
| 8 | L008 | West | West | West-L008 | 53333.33 | 200000 | WEST | west |
| 9 | L009 | South | South | South-L009 | 33750.00 | 60000 | SOUTH | south |
| 10 | L010 | North | North | North-L010 | 34394.44 | 82000 | NORTH | north |
| 11 | L011 | East | East | East-L011 | 23333.33 | 28000 | EAST | east |
| 12 | L012 | West | West | West-L012 | 46000.00 | 150000 | WEST | west |
| 13 | L013 | South | South | South-L013 | 27500.00 | 55000 | SOUTH | south |
| 14 | L014 | North | North | North-L014 | 30000.00 | 90000 | NORTH | north |
| 15 | L015 | East | East | East-L015 | 21388.89 | 35000 | EAST | east |
| 16 | L016 | West | West | West-L016 | 50000.00 | 180000 | WEST | west |
Used to standardize text data for joins and reports.
Used for risk calculations and scoring.
5. DATE FUNCTIONS (MDY, INTCK, INTNX)
data loans_dates;
set loans_clean;
Review_Date = intnx('month', Loan_Date, 6);
Loan_Age_Months = intck('month', Loan_Date, today());
format Review_Date date9.;
run;
proc print data=loans_dates;
run;
OUTPUT:
| Obs | Loan_ID | Region | Loan_Amount | Interest_Rate | Repayment_Duration | Default_Flag | Loan_Date | Region_Proper | Loan_Category | Risk_Score | Loan_Amount_Filled | Region_Upper | Region_Lower | Review_Date | Loan_Age_Months |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | L001 | South | 50000 | 12.5 | 24 | 0 | 01JAN2023 | South | South-L001 | 26041.67 | 50000 | SOUTH | south | 01JUL2023 | 36 |
| 2 | L002 | North | 75000 | 14.2 | 36 | 1 | 15FEB2023 | North | North-L002 | 29583.33 | 75000 | NORTH | north | 01AUG2023 | 35 |
| 3 | L003 | East | 30000 | 10.8 | 12 | 0 | 10MAR2023 | East | East-L003 | 27000.00 | 30000 | EAST | east | 01SEP2023 | 34 |
| 4 | L004 | West | 120000 | 16.5 | 48 | 1 | 20APR2023 | West | West-L004 | 41250.00 | 120000 | WEST | west | 01OCT2023 | 33 |
| 5 | L005 | South | 45000 | 11.2 | 18 | 0 | 05MAY2023 | South | South-L005 | 28000.00 | 45000 | SOUTH | south | 01NOV2023 | 32 |
| 6 | L006 | North | 95000 | 17.8 | 60 | 1 | 11JUN2023 | North | North-L006 | 28183.33 | 95000 | NORTH | north | 01DEC2023 | 31 |
| 7 | L007 | East | 38000 | 9.5 | 12 | 0 | 02JUL2023 | East | East-L007 | 30083.33 | 38000 | EAST | east | 01JAN2024 | 30 |
| 8 | L008 | West | 200000 | 19.2 | 72 | 1 | 19AUG2023 | West | West-L008 | 53333.33 | 200000 | WEST | west | 01FEB2024 | 29 |
| 9 | L009 | South | 60000 | 13.5 | 24 | 0 | 10SEP2023 | South | South-L009 | 33750.00 | 60000 | SOUTH | south | 01MAR2024 | 28 |
| 10 | L010 | North | 82000 | 15.1 | 36 | 1 | 21OCT2023 | North | North-L010 | 34394.44 | 82000 | NORTH | north | 01APR2024 | 27 |
| 11 | L011 | East | 28000 | 10.0 | 12 | 0 | 11NOV2023 | East | East-L011 | 23333.33 | 28000 | EAST | east | 01MAY2024 | 26 |
| 12 | L012 | West | 150000 | 18.4 | 60 | 1 | 15DEC2023 | West | West-L012 | 46000.00 | 150000 | WEST | west | 01JUN2024 | 25 |
| 13 | L013 | South | 55000 | 12.0 | 24 | 0 | 05JAN2024 | South | South-L013 | 27500.00 | 55000 | SOUTH | south | 01JUL2024 | 24 |
| 14 | L014 | North | 90000 | 16.0 | 48 | 1 | 22FEB2024 | North | North-L014 | 30000.00 | 90000 | NORTH | north | 01AUG2024 | 23 |
| 15 | L015 | East | 35000 | 11.0 | 18 | 0 | 10MAR2024 | East | East-L015 | 21388.89 | 35000 | EAST | east | 01SEP2024 | 22 |
| 16 | L016 | West | 180000 | 20.0 | 72 | 1 | 01APR2024 | West | West-L016 | 50000.00 | 180000 | WEST | west | 01OCT2024 | 21 |
Used for:
· Loan aging
· Review cycles
· Compliance timelines
6. UTILIZATION CLASSIFICATION MACRO
%macro utilization;
data utilization_flag;
set loans_dates;
length Utilization $8.;
if Loan_Amount >= 100000 then Loan_Amount = "High";
else if Loan_Amount >= 50000 then Utilization = "Medium";
else Utilization = "Low";
run;
proc print data=utilization_flag;
run;
%mend;
%utilization;
OUTPUT:
| Obs | Loan_ID | Region | Loan_Amount | Interest_Rate | Repayment_Duration | Default_Flag | Loan_Date | Region_Proper | Loan_Category | Risk_Score | Loan_Amount_Filled | Region_Upper | Region_Lower | Review_Date | Loan_Age_Months | Utilization |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | L001 | South | 50000 | 12.5 | 24 | 0 | 01JAN2023 | South | South-L001 | 26041.67 | 50000 | SOUTH | south | 01JUL2023 | 36 | Medium |
| 2 | L002 | North | 75000 | 14.2 | 36 | 1 | 15FEB2023 | North | North-L002 | 29583.33 | 75000 | NORTH | north | 01AUG2023 | 35 | Medium |
| 3 | L003 | East | 30000 | 10.8 | 12 | 0 | 10MAR2023 | East | East-L003 | 27000.00 | 30000 | EAST | east | 01SEP2023 | 34 | Low |
| 4 | L004 | West | . | 16.5 | 48 | 1 | 20APR2023 | West | West-L004 | 41250.00 | 120000 | WEST | west | 01OCT2023 | 33 | |
| 5 | L005 | South | 45000 | 11.2 | 18 | 0 | 05MAY2023 | South | South-L005 | 28000.00 | 45000 | SOUTH | south | 01NOV2023 | 32 | Low |
| 6 | L006 | North | 95000 | 17.8 | 60 | 1 | 11JUN2023 | North | North-L006 | 28183.33 | 95000 | NORTH | north | 01DEC2023 | 31 | Medium |
| 7 | L007 | East | 38000 | 9.5 | 12 | 0 | 02JUL2023 | East | East-L007 | 30083.33 | 38000 | EAST | east | 01JAN2024 | 30 | Low |
| 8 | L008 | West | . | 19.2 | 72 | 1 | 19AUG2023 | West | West-L008 | 53333.33 | 200000 | WEST | west | 01FEB2024 | 29 | |
| 9 | L009 | South | 60000 | 13.5 | 24 | 0 | 10SEP2023 | South | South-L009 | 33750.00 | 60000 | SOUTH | south | 01MAR2024 | 28 | Medium |
| 10 | L010 | North | 82000 | 15.1 | 36 | 1 | 21OCT2023 | North | North-L010 | 34394.44 | 82000 | NORTH | north | 01APR2024 | 27 | Medium |
| 11 | L011 | East | 28000 | 10.0 | 12 | 0 | 11NOV2023 | East | East-L011 | 23333.33 | 28000 | EAST | east | 01MAY2024 | 26 | Low |
| 12 | L012 | West | . | 18.4 | 60 | 1 | 15DEC2023 | West | West-L012 | 46000.00 | 150000 | WEST | west | 01JUN2024 | 25 | |
| 13 | L013 | South | 55000 | 12.0 | 24 | 0 | 05JAN2024 | South | South-L013 | 27500.00 | 55000 | SOUTH | south | 01JUL2024 | 24 | Medium |
| 14 | L014 | North | 90000 | 16.0 | 48 | 1 | 22FEB2024 | North | North-L014 | 30000.00 | 90000 | NORTH | north | 01AUG2024 | 23 | Medium |
| 15 | L015 | East | 35000 | 11.0 | 18 | 0 | 10MAR2024 | East | East-L015 | 21388.89 | 35000 | EAST | east | 01SEP2024 | 22 | Low |
| 16 | L016 | West | . | 20.0 | 72 | 1 | 01APR2024 | West | West-L016 | 50000.00 | 180000 | WEST | west | 01OCT2024 | 21 |
Used for:
· Automation
· Reusability
· Enterprise scale processing
7. FRAUD DETECTION MACRO
%macro fraud_detection;
data fraud_flag;
set utilization_flag;
if Default_Flag=1 and Loan_Amount>100000 and Interest_Rate>15 then Fraud="Yes";
else Fraud="No";
run;
proc print data=fraud_flag;
run;
%mend;
%fraud_detection;
OUTPUT:
| Obs | Loan_ID | Region | Loan_Amount | Interest_Rate | Repayment_Duration | Default_Flag | Loan_Date | Region_Proper | Loan_Category | Risk_Score | Loan_Amount_Filled | Region_Upper | Region_Lower | Review_Date | Loan_Age_Months | Utilization | Fraud |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | L001 | South | 50000 | 12.5 | 24 | 0 | 01JAN2023 | South | South-L001 | 26041.67 | 50000 | SOUTH | south | 01JUL2023 | 36 | Medium | No |
| 2 | L002 | North | 75000 | 14.2 | 36 | 1 | 15FEB2023 | North | North-L002 | 29583.33 | 75000 | NORTH | north | 01AUG2023 | 35 | Medium | No |
| 3 | L003 | East | 30000 | 10.8 | 12 | 0 | 10MAR2023 | East | East-L003 | 27000.00 | 30000 | EAST | east | 01SEP2023 | 34 | Low | No |
| 4 | L004 | West | . | 16.5 | 48 | 1 | 20APR2023 | West | West-L004 | 41250.00 | 120000 | WEST | west | 01OCT2023 | 33 | No | |
| 5 | L005 | South | 45000 | 11.2 | 18 | 0 | 05MAY2023 | South | South-L005 | 28000.00 | 45000 | SOUTH | south | 01NOV2023 | 32 | Low | No |
| 6 | L006 | North | 95000 | 17.8 | 60 | 1 | 11JUN2023 | North | North-L006 | 28183.33 | 95000 | NORTH | north | 01DEC2023 | 31 | Medium | No |
| 7 | L007 | East | 38000 | 9.5 | 12 | 0 | 02JUL2023 | East | East-L007 | 30083.33 | 38000 | EAST | east | 01JAN2024 | 30 | Low | No |
| 8 | L008 | West | . | 19.2 | 72 | 1 | 19AUG2023 | West | West-L008 | 53333.33 | 200000 | WEST | west | 01FEB2024 | 29 | No | |
| 9 | L009 | South | 60000 | 13.5 | 24 | 0 | 10SEP2023 | South | South-L009 | 33750.00 | 60000 | SOUTH | south | 01MAR2024 | 28 | Medium | No |
| 10 | L010 | North | 82000 | 15.1 | 36 | 1 | 21OCT2023 | North | North-L010 | 34394.44 | 82000 | NORTH | north | 01APR2024 | 27 | Medium | No |
| 11 | L011 | East | 28000 | 10.0 | 12 | 0 | 11NOV2023 | East | East-L011 | 23333.33 | 28000 | EAST | east | 01MAY2024 | 26 | Low | No |
| 12 | L012 | West | . | 18.4 | 60 | 1 | 15DEC2023 | West | West-L012 | 46000.00 | 150000 | WEST | west | 01JUN2024 | 25 | No | |
| 13 | L013 | South | 55000 | 12.0 | 24 | 0 | 05JAN2024 | South | South-L013 | 27500.00 | 55000 | SOUTH | south | 01JUL2024 | 24 | Medium | No |
| 14 | L014 | North | 90000 | 16.0 | 48 | 1 | 22FEB2024 | North | North-L014 | 30000.00 | 90000 | NORTH | north | 01AUG2024 | 23 | Medium | No |
| 15 | L015 | East | 35000 | 11.0 | 18 | 0 | 10MAR2024 | East | East-L015 | 21388.89 | 35000 | EAST | east | 01SEP2024 | 22 | Low | No |
| 16 | L016 | West | . | 20.0 | 72 | 1 | 01APR2024 | West | West-L016 | 50000.00 | 180000 | WEST | west | 01OCT2024 | 21 | No |
8. PROC SQL ANALYTICS
proc sql;
select Region,
count(*) as Total_Loans,
avg(Loan_Amount) as Avg_Loan,
sum(Default_Flag) as Defaults
from fraud_flag
group by Region;
quit;
OUTPUT:
| Region | Total_Loans | Avg_Loan | Defaults |
|---|---|---|---|
| East | 4 | 32750 | 0 |
| North | 4 | 85500 | 4 |
| South | 4 | 52500 | 0 |
| West | 4 | . | 4 |
Used for:
· Business aggregation
· Executive dashboards
9. PROC FREQ
proc freq data=fraud_flag;
tables Region*Fraud / nocum;
run;
OUTPUT:
The FREQ Procedure
|
| ||||||||||||||||||||||||
Used for:
· Fraud pattern detection
· Risk segmentation
10. PROC MEANS
proc means data=fraud_flag mean min max std;
var Loan_Amount Interest_Rate Risk_Score;
run;
OUTPUT:
The MEANS Procedure
| Variable | Mean | Minimum | Maximum | Std Dev |
|---|---|---|---|---|
Loan_Amount Interest_Rate Risk_Score | 56916.67 14.2312500 33115.10 | 28000.00 9.5000000 21388.89 | 95000.00 20.0000000 53333.33 | 23554.42 3.4381621 9529.30 |
Used for:
· Portfolio health
· Exposure analysis
11. PROC UNIVARIATE
proc univariate data=fraud_flag;
var Loan_Amount;
histogram Loan_Amount;
run;
OUTPUT:
The UNIVARIATE Procedure
Variable: Loan_Amount
| Moments | |||
|---|---|---|---|
| N | 12 | Sum Weights | 12 |
| Mean | 56916.6667 | Sum Observations | 683000 |
| Std Deviation | 23554.418 | Variance | 554810606 |
| Skewness | 0.41873121 | Kurtosis | -1.2640358 |
| Uncorrected SS | 4.4977E10 | Corrected SS | 6102916667 |
| Coeff Variation | 41.3840433 | Std Error Mean | 6799.57478 |
| Basic Statistical Measures | |||
|---|---|---|---|
| Location | Variability | ||
| Mean | 56916.67 | Std Deviation | 23554 |
| Median | 52500.00 | Variance | 554810606 |
| Mode | . | Range | 67000 |
| Interquartile Range | 42000 | ||
| Tests for Location: Mu0=0 | ||||
|---|---|---|---|---|
| Test | Statistic | p Value | ||
| Student's t | t | 8.370621 | Pr > |t| | <.0001 |
| Sign | M | 6 | Pr >= |M| | 0.0005 |
| Signed Rank | S | 39 | Pr >= |S| | 0.0005 |
| Quantiles (Definition 5) | |
|---|---|
| Level | Quantile |
| 100% Max | 95000 |
| 99% | 95000 |
| 95% | 95000 |
| 90% | 90000 |
| 75% Q3 | 78500 |
| 50% Median | 52500 |
| 25% Q1 | 36500 |
| 10% | 30000 |
| 5% | 28000 |
| 1% | 28000 |
| 0% Min | 28000 |
| Extreme Observations | |||
|---|---|---|---|
| Lowest | Highest | ||
| Value | Obs | Value | Obs |
| 28000 | 11 | 60000 | 9 |
| 30000 | 3 | 75000 | 2 |
| 35000 | 15 | 82000 | 10 |
| 38000 | 7 | 90000 | 14 |
| 45000 | 5 | 95000 | 6 |
| Missing Values | |||
|---|---|---|---|
| Missing Value | Count | Percent Of | |
| All Obs | Missing Obs | ||
| . | 4 | 25.00 | 100.00 |
The UNIVARIATE Procedure
Used for:
· Outlier detection
· Distribution analysis
12. PROC CORR
proc corr data=fraud_flag;
var Loan_Amount Interest_Rate Risk_Score Default_Flag;
run;
OUTPUT:
The CORR Procedure
| 4 Variables: | Loan_Amount Interest_Rate Risk_Score Default_Flag |
|---|
| Simple Statistics | ||||||
|---|---|---|---|---|---|---|
| Variable | N | Mean | Std Dev | Sum | Minimum | Maximum |
| Loan_Amount | 12 | 56917 | 23554 | 683000 | 28000 | 95000 |
| Interest_Rate | 16 | 14.23125 | 3.43816 | 227.70000 | 9.50000 | 20.00000 |
| Risk_Score | 16 | 33115 | 9529 | 529842 | 21389 | 53333 |
| Default_Flag | 16 | 0.50000 | 0.51640 | 8.00000 | 0 | 1.00000 |
| Pearson Correlation Coefficients Prob > |r| under H0: Rho=0 Number of Observations | ||||
|---|---|---|---|---|
| Loan_Amount | Interest_Rate | Risk_Score | Default_Flag | |
| Loan_Amount | 1.00000 12 | 0.96508 <.0001 12 | 0.57848 0.0488 12 | 0.89623 <.0001 12 |
| Interest_Rate | 0.96508 <.0001 12 | 1.00000 16 | 0.81914 0.0001 16 | 0.87677 <.0001 16 |
| Risk_Score | 0.57848 0.0488 12 | 0.81914 0.0001 16 | 1.00000 16 | 0.64790 0.0066 16 |
| Default_Flag | 0.89623 <.0001 12 | 0.87677 <.0001 16 | 0.64790 0.0066 16 | 1.00000 16 |
Used for:
· Understanding relationships
· Model building inputs
13. PROC SGPLOT
proc sgplot data=fraud_flag;
vbar Region / response=Loan_Amount stat=mean;
run;
OUTPUT:
Used for:
· Management visualization
· KPI dashboards
14. TRANSPOSE, MERGE, APPEND
proc transpose data=fraud_flag out=loan_t;
var Loan_Amount Interest_Rate;
run;
proc print data=loan_t;
run;
OUTPUT:
| Obs | _NAME_ | COL1 | COL2 | COL3 | COL4 | COL5 | COL6 | COL7 | COL8 | COL9 | COL10 | COL11 | COL12 | COL13 | COL14 | COL15 | COL16 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Loan_Amount | 50000.0 | 75000.0 | 30000.0 | . | 45000.0 | 95000.0 | 38000.0 | . | 60000.0 | 82000.0 | 28000 | . | 55000 | 90000 | 35000 | . |
| 2 | Interest_Rate | 12.5 | 14.2 | 10.8 | 16.5 | 11.2 | 17.8 | 9.5 | 19.2 | 13.5 | 15.1 | 10 | 18.4 | 12 | 16 | 11 | 20 |
proc append base=microfinance_loans
data=fraud_flag force;
run;
proc print data=microfinance_loans;
run;
OUTPUT:
| Obs | Loan_ID | Region | Loan_Amount | Interest_Rate | Repayment_Duration | Default_Flag | Loan_Date |
|---|---|---|---|---|---|---|---|
| 1 | L001 | South | 50000 | 12.5 | 24 | 0 | 01JAN2023 |
| 2 | L002 | North | 75000 | 14.2 | 36 | 1 | 15FEB2023 |
| 3 | L003 | East | 30000 | 10.8 | 12 | 0 | 10MAR2023 |
| 4 | L004 | West | 120000 | 16.5 | 48 | 1 | 20APR2023 |
| 5 | L005 | South | 45000 | 11.2 | 18 | 0 | 05MAY2023 |
| 6 | L006 | North | 95000 | 17.8 | 60 | 1 | 11JUN2023 |
| 7 | L007 | East | 38000 | 9.5 | 12 | 0 | 02JUL2023 |
| 8 | L008 | West | 200000 | 19.2 | 72 | 1 | 19AUG2023 |
| 9 | L009 | South | 60000 | 13.5 | 24 | 0 | 10SEP2023 |
| 10 | L010 | North | 82000 | 15.1 | 36 | 1 | 21OCT2023 |
| 11 | L011 | East | 28000 | 10.0 | 12 | 0 | 11NOV2023 |
| 12 | L012 | West | 150000 | 18.4 | 60 | 1 | 15DEC2023 |
| 13 | L013 | South | 55000 | 12.0 | 24 | 0 | 05JAN2024 |
| 14 | L014 | North | 90000 | 16.0 | 48 | 1 | 22FEB2024 |
| 15 | L015 | East | 35000 | 11.0 | 18 | 0 | 10MAR2024 |
| 16 | L016 | West | 180000 | 20.0 | 72 | 1 | 01APR2024 |
| 17 | L001 | South | 50000 | 12.5 | 24 | 0 | 01JAN2023 |
| 18 | L002 | North | 75000 | 14.2 | 36 | 1 | 15FEB2023 |
| 19 | L003 | East | 30000 | 10.8 | 12 | 0 | 10MAR2023 |
| 20 | L004 | West | . | 16.5 | 48 | 1 | 20APR2023 |
| 21 | L005 | South | 45000 | 11.2 | 18 | 0 | 05MAY2023 |
| 22 | L006 | North | 95000 | 17.8 | 60 | 1 | 11JUN2023 |
| 23 | L007 | East | 38000 | 9.5 | 12 | 0 | 02JUL2023 |
| 24 | L008 | West | . | 19.2 | 72 | 1 | 19AUG2023 |
| 25 | L009 | South | 60000 | 13.5 | 24 | 0 | 10SEP2023 |
| 26 | L010 | North | 82000 | 15.1 | 36 | 1 | 21OCT2023 |
| 27 | L011 | East | 28000 | 10.0 | 12 | 0 | 11NOV2023 |
| 28 | L012 | West | . | 18.4 | 60 | 1 | 15DEC2023 |
| 29 | L013 | South | 55000 | 12.0 | 24 | 0 | 05JAN2024 |
| 30 | L014 | North | 90000 | 16.0 | 48 | 1 | 22FEB2024 |
| 31 | L015 | East | 35000 | 11.0 | 18 | 0 | 10MAR2024 |
| 32 | L016 | West | . | 20.0 | 72 | 1 | 01APR2024 |
Used to add new data into historical tables.
proc sort data=microfinance_loans;by Loan_ID;run;
proc print data=microfinance_loans;
run;
OUTPUT:
| Obs | Loan_ID | Region | Loan_Amount | Interest_Rate | Repayment_Duration | Default_Flag | Loan_Date |
|---|---|---|---|---|---|---|---|
| 1 | L001 | South | 50000 | 12.5 | 24 | 0 | 01JAN2023 |
| 2 | L001 | South | 50000 | 12.5 | 24 | 0 | 01JAN2023 |
| 3 | L002 | North | 75000 | 14.2 | 36 | 1 | 15FEB2023 |
| 4 | L002 | North | 75000 | 14.2 | 36 | 1 | 15FEB2023 |
| 5 | L003 | East | 30000 | 10.8 | 12 | 0 | 10MAR2023 |
| 6 | L003 | East | 30000 | 10.8 | 12 | 0 | 10MAR2023 |
| 7 | L004 | West | 120000 | 16.5 | 48 | 1 | 20APR2023 |
| 8 | L004 | West | . | 16.5 | 48 | 1 | 20APR2023 |
| 9 | L005 | South | 45000 | 11.2 | 18 | 0 | 05MAY2023 |
| 10 | L005 | South | 45000 | 11.2 | 18 | 0 | 05MAY2023 |
| 11 | L006 | North | 95000 | 17.8 | 60 | 1 | 11JUN2023 |
| 12 | L006 | North | 95000 | 17.8 | 60 | 1 | 11JUN2023 |
| 13 | L007 | East | 38000 | 9.5 | 12 | 0 | 02JUL2023 |
| 14 | L007 | East | 38000 | 9.5 | 12 | 0 | 02JUL2023 |
| 15 | L008 | West | 200000 | 19.2 | 72 | 1 | 19AUG2023 |
| 16 | L008 | West | . | 19.2 | 72 | 1 | 19AUG2023 |
| 17 | L009 | South | 60000 | 13.5 | 24 | 0 | 10SEP2023 |
| 18 | L009 | South | 60000 | 13.5 | 24 | 0 | 10SEP2023 |
| 19 | L010 | North | 82000 | 15.1 | 36 | 1 | 21OCT2023 |
| 20 | L010 | North | 82000 | 15.1 | 36 | 1 | 21OCT2023 |
| 21 | L011 | East | 28000 | 10.0 | 12 | 0 | 11NOV2023 |
| 22 | L011 | East | 28000 | 10.0 | 12 | 0 | 11NOV2023 |
| 23 | L012 | West | 150000 | 18.4 | 60 | 1 | 15DEC2023 |
| 24 | L012 | West | . | 18.4 | 60 | 1 | 15DEC2023 |
| 25 | L013 | South | 55000 | 12.0 | 24 | 0 | 05JAN2024 |
| 26 | L013 | South | 55000 | 12.0 | 24 | 0 | 05JAN2024 |
| 27 | L014 | North | 90000 | 16.0 | 48 | 1 | 22FEB2024 |
| 28 | L014 | North | 90000 | 16.0 | 48 | 1 | 22FEB2024 |
| 29 | L015 | East | 35000 | 11.0 | 18 | 0 | 10MAR2024 |
| 30 | L015 | East | 35000 | 11.0 | 18 | 0 | 10MAR2024 |
| 31 | L016 | West | 180000 | 20.0 | 72 | 1 | 01APR2024 |
| 32 | L016 | West | . | 20.0 | 72 | 1 | 01APR2024 |
proc sort data=fraud_flag;by Loan_ID;run;
proc print data=fraud_flag;
run;
OUTPUT:
| Obs | Loan_ID | Region | Loan_Amount | Interest_Rate | Repayment_Duration | Default_Flag | Loan_Date | Region_Proper | Loan_Category | Risk_Score | Loan_Amount_Filled | Region_Upper | Region_Lower | Review_Date | Loan_Age_Months | Utilization | Fraud |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | L001 | South | 50000 | 12.5 | 24 | 0 | 01JAN2023 | South | South-L001 | 26041.67 | 50000 | SOUTH | south | 01JUL2023 | 36 | Medium | No |
| 2 | L002 | North | 75000 | 14.2 | 36 | 1 | 15FEB2023 | North | North-L002 | 29583.33 | 75000 | NORTH | north | 01AUG2023 | 35 | Medium | No |
| 3 | L003 | East | 30000 | 10.8 | 12 | 0 | 10MAR2023 | East | East-L003 | 27000.00 | 30000 | EAST | east | 01SEP2023 | 34 | Low | No |
| 4 | L004 | West | . | 16.5 | 48 | 1 | 20APR2023 | West | West-L004 | 41250.00 | 120000 | WEST | west | 01OCT2023 | 33 | No | |
| 5 | L005 | South | 45000 | 11.2 | 18 | 0 | 05MAY2023 | South | South-L005 | 28000.00 | 45000 | SOUTH | south | 01NOV2023 | 32 | Low | No |
| 6 | L006 | North | 95000 | 17.8 | 60 | 1 | 11JUN2023 | North | North-L006 | 28183.33 | 95000 | NORTH | north | 01DEC2023 | 31 | Medium | No |
| 7 | L007 | East | 38000 | 9.5 | 12 | 0 | 02JUL2023 | East | East-L007 | 30083.33 | 38000 | EAST | east | 01JAN2024 | 30 | Low | No |
| 8 | L008 | West | . | 19.2 | 72 | 1 | 19AUG2023 | West | West-L008 | 53333.33 | 200000 | WEST | west | 01FEB2024 | 29 | No | |
| 9 | L009 | South | 60000 | 13.5 | 24 | 0 | 10SEP2023 | South | South-L009 | 33750.00 | 60000 | SOUTH | south | 01MAR2024 | 28 | Medium | No |
| 10 | L010 | North | 82000 | 15.1 | 36 | 1 | 21OCT2023 | North | North-L010 | 34394.44 | 82000 | NORTH | north | 01APR2024 | 27 | Medium | No |
| 11 | L011 | East | 28000 | 10.0 | 12 | 0 | 11NOV2023 | East | East-L011 | 23333.33 | 28000 | EAST | east | 01MAY2024 | 26 | Low | No |
| 12 | L012 | West | . | 18.4 | 60 | 1 | 15DEC2023 | West | West-L012 | 46000.00 | 150000 | WEST | west | 01JUN2024 | 25 | No | |
| 13 | L013 | South | 55000 | 12.0 | 24 | 0 | 05JAN2024 | South | South-L013 | 27500.00 | 55000 | SOUTH | south | 01JUL2024 | 24 | Medium | No |
| 14 | L014 | North | 90000 | 16.0 | 48 | 1 | 22FEB2024 | North | North-L014 | 30000.00 | 90000 | NORTH | north | 01AUG2024 | 23 | Medium | No |
| 15 | L015 | East | 35000 | 11.0 | 18 | 0 | 10MAR2024 | East | East-L015 | 21388.89 | 35000 | EAST | east | 01SEP2024 | 22 | Low | No |
| 16 | L016 | West | . | 20.0 | 72 | 1 | 01APR2024 | West | West-L016 | 50000.00 | 180000 | WEST | west | 01OCT2024 | 21 | No |
data merged_data;
merge microfinance_loans
fraud_flag;
by Loan_ID;
run;
proc print data=fraud_flag;
run;
OUTPUT:
| Obs | Loan_ID | Region | Loan_Amount | Interest_Rate | Repayment_Duration | Default_Flag | Loan_Date | Region_Proper | Loan_Category | Risk_Score | Loan_Amount_Filled | Region_Upper | Region_Lower | Review_Date | Loan_Age_Months | Utilization | Fraud |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | L001 | South | 50000 | 12.5 | 24 | 0 | 01JAN2023 | South | South-L001 | 26041.67 | 50000 | SOUTH | south | 01JUL2023 | 36 | Medium | No |
| 2 | L002 | North | 75000 | 14.2 | 36 | 1 | 15FEB2023 | North | North-L002 | 29583.33 | 75000 | NORTH | north | 01AUG2023 | 35 | Medium | No |
| 3 | L003 | East | 30000 | 10.8 | 12 | 0 | 10MAR2023 | East | East-L003 | 27000.00 | 30000 | EAST | east | 01SEP2023 | 34 | Low | No |
| 4 | L004 | West | . | 16.5 | 48 | 1 | 20APR2023 | West | West-L004 | 41250.00 | 120000 | WEST | west | 01OCT2023 | 33 | No | |
| 5 | L005 | South | 45000 | 11.2 | 18 | 0 | 05MAY2023 | South | South-L005 | 28000.00 | 45000 | SOUTH | south | 01NOV2023 | 32 | Low | No |
| 6 | L006 | North | 95000 | 17.8 | 60 | 1 | 11JUN2023 | North | North-L006 | 28183.33 | 95000 | NORTH | north | 01DEC2023 | 31 | Medium | No |
| 7 | L007 | East | 38000 | 9.5 | 12 | 0 | 02JUL2023 | East | East-L007 | 30083.33 | 38000 | EAST | east | 01JAN2024 | 30 | Low | No |
| 8 | L008 | West | . | 19.2 | 72 | 1 | 19AUG2023 | West | West-L008 | 53333.33 | 200000 | WEST | west | 01FEB2024 | 29 | No | |
| 9 | L009 | South | 60000 | 13.5 | 24 | 0 | 10SEP2023 | South | South-L009 | 33750.00 | 60000 | SOUTH | south | 01MAR2024 | 28 | Medium | No |
| 10 | L010 | North | 82000 | 15.1 | 36 | 1 | 21OCT2023 | North | North-L010 | 34394.44 | 82000 | NORTH | north | 01APR2024 | 27 | Medium | No |
| 11 | L011 | East | 28000 | 10.0 | 12 | 0 | 11NOV2023 | East | East-L011 | 23333.33 | 28000 | EAST | east | 01MAY2024 | 26 | Low | No |
| 12 | L012 | West | . | 18.4 | 60 | 1 | 15DEC2023 | West | West-L012 | 46000.00 | 150000 | WEST | west | 01JUN2024 | 25 | No | |
| 13 | L013 | South | 55000 | 12.0 | 24 | 0 | 05JAN2024 | South | South-L013 | 27500.00 | 55000 | SOUTH | south | 01JUL2024 | 24 | Medium | No |
| 14 | L014 | North | 90000 | 16.0 | 48 | 1 | 22FEB2024 | North | North-L014 | 30000.00 | 90000 | NORTH | north | 01AUG2024 | 23 | Medium | No |
| 15 | L015 | East | 35000 | 11.0 | 18 | 0 | 10MAR2024 | East | East-L015 | 21388.89 | 35000 | EAST | east | 01SEP2024 | 22 | Low | No |
| 16 | L016 | West | . | 20.0 | 72 | 1 | 01APR2024 | West | West-L016 | 50000.00 | 180000 | WEST | west | 01OCT2024 | 21 | No |
Used to combine datasets by key.
15. PROC DATASETS DELETE
proc datasets library=work;
delete loan_t;
quit;
LOG:
Used for:
· Memory optimization
· Housekeeping in production
HOW THIS HELPS IN INTERVIEWS
This single project covers:
Topic | Covered |
Base SAS | Yes |
SQL | Yes |
Macros | Yes |
Dates | Yes |
Statistics | Yes |
Business Logic | Yes |
Fraud Detection | Yes |
Risk Modeling | Yes |
Visualization | Yes |
Data Engineering | Yes |
Conclusion
This project clearly shows how SAS can be used as a powerful business tool, not just a programming language. By using a simple microfinance loan dataset, we were able to simulate real-world problems like loan risk, customer default, and fraud detection.
Through procedures like PROC SQL, PROC MEANS, PROC FREQ, PROC UNIVARIATE, PROC CORR, and PROC SGPLOT, we converted raw loan data into meaningful business insights. The use of macros helped automate complex logic, making the system reusable and scalable for thousands of records in real companies.
In simple words, this project proves that:
· SAS helps identify risky customers early
· SAS supports data-driven lending decisions
· SAS reduces financial loss and fraud
· SAS improves portfolio performance
INTERVIEW QUESTIONS FOR YOU
· What is a macro variable?
· What is the difference between %LET and CALL SYMPUT?
· What is the difference between SYMGET and SYMPUT?
About the Author:
SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.
Disclaimer:
The datasets and analysis in this article are created for educational and demonstration purposes only. They do not represent Microfinanace Loan data.
Our Mission:
This blog provides industry-focused SAS programming tutorials and analytics projects covering finance, healthcare, and technology.
This project is suitable for:
· Students learning SAS
· Data analysts building portfolios
· Professionals preparing for SAS interviews
· Bloggers writing about analytics and smart cities
· EV and energy industry professionals
To deepen your understanding of SAS analytics, please refer to our other data science and industry-focused projects listed below:
1.How Fast Is Our Planet Warming? A Data-Driven Climate Change Study in SAS
2.How Efficient Are Indian Highways? What Can SAS Reveal About Traffic, Length, and Performance?
3.What Really Happens on a Website? Can SAS Reveal User Behavior and Traffic Patterns?
Comments
Post a Comment