ANALYZING GLOBAL BANK FRAUDSTERS DATASET USING PROC PRINT | PROC MEANS | PROC FREQ | PROC SQL | PROC SGPLOT | PROC MACRO | IN SAS FOR INSIGHTS ON FINANCIAL CRIME TYPES TRENDS PENALTIES RISKS WORLDWIDE
/*Different types of bank fraudsters worldwide*/
STEP 1: DATASET CREATION
data Bank_Fraudsters;
length Fraud_ID 8
Fraudster_Name $25
Country $20
Fraud_Type $30
Amount_Lost_USD 8
Year 8
Method $25
Caught $3
Penalty_Years 8;
input Fraud_ID Fraudster_Name $ Country $ Fraud_Type $ Amount_Lost_USD Year Method $ Caught $ Penalty_Years;
datalines;
1 John_Doe USA Identity_Theft 125000 2019 Online_Yield Yes 5
2 Raj_Patel India Card_Cloning 84000 2020 ATM_Hack No .
3 Mei_Wang China Money_Laundering 430000 2021 Shell_Companies Yes 10
4 Carlos_Soto Mexico Loan_Scam 59000 2018 Fake_Records Yes 3
5 Fatima_Ali UAE Phishing 72000 2022 Email_Spoofing No .
6 Ivan_Kuznetsov Russia Crypto_Scam 205000 2021 Blockchain_Fraud No .
7 Jean_Martin France Investment_Fraud 110000 2020 Ponzi_Scheme Yes 6
8 Linda_Kim South_Korea Skimming 63000 2019 ATM_Device Yes 4
9 George_Smith UK Account_Takeover 95000 2022 SIM_Swap No .
10 Akira_Tanaka Japan Check_Fraud 27000 2018 Forged_Cheques Yes 2
11 Ahmed_Khan Pakistan Online_Scam 34000 2021 Fake_Websites No .
12 Maria_Rojas Brazil Money_Laundering 470000 2020 Shell_Companies Yes 8
13 Bongani_Ndlovu South_Africa Phishing 59000 2019 Email_Spoofing Yes 3
14 Emily_Wilson Canada Identity_Theft 88000 2022 Fake_Documents No .
15 Elena_Popov Ukraine Card_Cloning 77000 2021 ATM_Hack Yes 5
16 Andrew_Blake Australia Loan_Scam 66000 2020 Fake_Credit No .
17 Hassan_Musa Nigeria Investment_Fraud 92000 2021 Ponzi_Scheme No .
18 Ingrid_Andersson Sweden Online_Scam 58000 2019 Fake_Ecommerce Yes 2
19 Gopal_Sharma India SIM_Swap 69000 2022 Mobile_Fraud Yes 4
20 Tommy_Nguyen Vietnam Crypto_Scam 230000 2021 Rug_Pull No .
;
run;
proc print;run;
Output:
| Obs | Fraud_ID | Fraudster_Name | Country | Fraud_Type | Amount_Lost_USD | Year | Method | Caught | Penalty_Years |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | John_Doe | USA | Identity_Theft | 125000 | 2019 | Online_Yield | Yes | 5 |
| 2 | 2 | Raj_Patel | India | Card_Cloning | 84000 | 2020 | ATM_Hack | No | . |
| 3 | 3 | Mei_Wang | China | Money_Laundering | 430000 | 2021 | Shell_Companies | Yes | 10 |
| 4 | 4 | Carlos_Soto | Mexico | Loan_Scam | 59000 | 2018 | Fake_Records | Yes | 3 |
| 5 | 5 | Fatima_Ali | UAE | Phishing | 72000 | 2022 | Email_Spoofing | No | . |
| 6 | 6 | Ivan_Kuznetsov | Russia | Crypto_Scam | 205000 | 2021 | Blockchain_Fraud | No | . |
| 7 | 7 | Jean_Martin | France | Investment_Fraud | 110000 | 2020 | Ponzi_Scheme | Yes | 6 |
| 8 | 8 | Linda_Kim | South_Korea | Skimming | 63000 | 2019 | ATM_Device | Yes | 4 |
| 9 | 9 | George_Smith | UK | Account_Takeover | 95000 | 2022 | SIM_Swap | No | . |
| 10 | 10 | Akira_Tanaka | Japan | Check_Fraud | 27000 | 2018 | Forged_Cheques | Yes | 2 |
| 11 | 11 | Ahmed_Khan | Pakistan | Online_Scam | 34000 | 2021 | Fake_Websites | No | . |
| 12 | 12 | Maria_Rojas | Brazil | Money_Laundering | 470000 | 2020 | Shell_Companies | Yes | 8 |
| 13 | 13 | Bongani_Ndlovu | South_Africa | Phishing | 59000 | 2019 | Email_Spoofing | Yes | 3 |
| 14 | 14 | Emily_Wilson | Canada | Identity_Theft | 88000 | 2022 | Fake_Documents | No | . |
| 15 | 15 | Elena_Popov | Ukraine | Card_Cloning | 77000 | 2021 | ATM_Hack | Yes | 5 |
| 16 | 16 | Andrew_Blake | Australia | Loan_Scam | 66000 | 2020 | Fake_Credit | No | . |
| 17 | 17 | Hassan_Musa | Nigeria | Investment_Fraud | 92000 | 2021 | Ponzi_Scheme | No | . |
| 18 | 18 | Ingrid_Andersson | Sweden | Online_Scam | 58000 | 2019 | Fake_Ecommerce | Yes | 2 |
| 19 | 19 | Gopal_Sharma | India | SIM_Swap | 69000 | 2022 | Mobile_Fraud | Yes | 4 |
| 20 | 20 | Tommy_Nguyen | Vietnam | Crypto_Scam | 230000 | 2021 | Rug_Pull | No | . |
STEP 2: PROC PRINT
proc print data=Bank_Fraudsters;
title "Complete Bank Fraudsters Dataset";
run;
Output:
| Complete Bank Fraudsters Dataset |
| Obs | Fraud_ID | Fraudster_Name | Country | Fraud_Type | Amount_Lost_USD | Year | Method | Caught | Penalty_Years |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | John_Doe | USA | Identity_Theft | 125000 | 2019 | Online_Yield | Yes | 5 |
| 2 | 2 | Raj_Patel | India | Card_Cloning | 84000 | 2020 | ATM_Hack | No | . |
| 3 | 3 | Mei_Wang | China | Money_Laundering | 430000 | 2021 | Shell_Companies | Yes | 10 |
| 4 | 4 | Carlos_Soto | Mexico | Loan_Scam | 59000 | 2018 | Fake_Records | Yes | 3 |
| 5 | 5 | Fatima_Ali | UAE | Phishing | 72000 | 2022 | Email_Spoofing | No | . |
| 6 | 6 | Ivan_Kuznetsov | Russia | Crypto_Scam | 205000 | 2021 | Blockchain_Fraud | No | . |
| 7 | 7 | Jean_Martin | France | Investment_Fraud | 110000 | 2020 | Ponzi_Scheme | Yes | 6 |
| 8 | 8 | Linda_Kim | South_Korea | Skimming | 63000 | 2019 | ATM_Device | Yes | 4 |
| 9 | 9 | George_Smith | UK | Account_Takeover | 95000 | 2022 | SIM_Swap | No | . |
| 10 | 10 | Akira_Tanaka | Japan | Check_Fraud | 27000 | 2018 | Forged_Cheques | Yes | 2 |
| 11 | 11 | Ahmed_Khan | Pakistan | Online_Scam | 34000 | 2021 | Fake_Websites | No | . |
| 12 | 12 | Maria_Rojas | Brazil | Money_Laundering | 470000 | 2020 | Shell_Companies | Yes | 8 |
| 13 | 13 | Bongani_Ndlovu | South_Africa | Phishing | 59000 | 2019 | Email_Spoofing | Yes | 3 |
| 14 | 14 | Emily_Wilson | Canada | Identity_Theft | 88000 | 2022 | Fake_Documents | No | . |
| 15 | 15 | Elena_Popov | Ukraine | Card_Cloning | 77000 | 2021 | ATM_Hack | Yes | 5 |
| 16 | 16 | Andrew_Blake | Australia | Loan_Scam | 66000 | 2020 | Fake_Credit | No | . |
| 17 | 17 | Hassan_Musa | Nigeria | Investment_Fraud | 92000 | 2021 | Ponzi_Scheme | No | . |
| 18 | 18 | Ingrid_Andersson | Sweden | Online_Scam | 58000 | 2019 | Fake_Ecommerce | Yes | 2 |
| 19 | 19 | Gopal_Sharma | India | SIM_Swap | 69000 | 2022 | Mobile_Fraud | Yes | 4 |
| 20 | 20 | Tommy_Nguyen | Vietnam | Crypto_Scam | 230000 | 2021 | Rug_Pull | No | . |
STEP 3: PROC MEANS
proc means data=Bank_Fraudsters mean sum min max maxdec=2;
var Amount_Lost_USD Penalty_Years;
title "Descriptive Statistics of Financial Loss and Penalties";
run;
Output:
| Descriptive Statistics of Financial Loss and Penalties |
| Variable | Mean | Sum | Minimum | Maximum | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
|
|
|
STEP 4: PROC FREQ
1. By Country:
proc freq data=Bank_Fraudsters;
tables Country / nocum nopercent;
title "Fraud Cases by Country";
run;
Output:
| Fraud Cases by Country |
| Country | Frequency |
|---|---|
| Australia | 1 |
| Brazil | 1 |
| Canada | 1 |
| China | 1 |
| France | 1 |
| India | 2 |
| Japan | 1 |
| Mexico | 1 |
| Nigeria | 1 |
| Pakistan | 1 |
| Russia | 1 |
| South_Africa | 1 |
| South_Korea | 1 |
| Sweden | 1 |
| UAE | 1 |
| UK | 1 |
| USA | 1 |
| Ukraine | 1 |
| Vietnam | 1 |
2. By Fraud Type:
proc freq data=Bank_Fraudsters;
tables Fraud_Type;
title "Frequency of Fraud Types";
run;
Output:
| Frequency of Fraud Types |
| Fraud_Type | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
|---|---|---|---|---|
| Account_Takeover | 1 | 5.00 | 1 | 5.00 |
| Card_Cloning | 2 | 10.00 | 3 | 15.00 |
| Check_Fraud | 1 | 5.00 | 4 | 20.00 |
| Crypto_Scam | 2 | 10.00 | 6 | 30.00 |
| Identity_Theft | 2 | 10.00 | 8 | 40.00 |
| Investment_Fraud | 2 | 10.00 | 10 | 50.00 |
| Loan_Scam | 2 | 10.00 | 12 | 60.00 |
| Money_Laundering | 2 | 10.00 | 14 | 70.00 |
| Online_Scam | 2 | 10.00 | 16 | 80.00 |
| Phishing | 2 | 10.00 | 18 | 90.00 |
| SIM_Swap | 1 | 5.00 | 19 | 95.00 |
| Skimming | 1 | 5.00 | 20 | 100.00 |
3. Caught vs Not Caught:
proc freq data=Bank_Fraudsters;
tables Caught;
title "Caught vs Not Caught Fraudsters";
run;
Output:
| Caught vs Not Caught Fraudsters |
| Caught | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
|---|---|---|---|---|
| No | 9 | 45.00 | 9 | 45.00 |
| Yes | 11 | 55.00 | 20 | 100.00 |
STEP 5: PROC SQL ANALYSIS
1. Top 5 Highest Losses
proc sql outobs=5;
title "Top 5 Fraud Cases by Amount Lost";
select Fraudster_Name, Country, Fraud_Type, Amount_Lost_USD
from Bank_Fraudsters
order by Amount_Lost_USD desc;
quit;
Output:
| Top 5 Fraud Cases by Amount Lost |
| Fraudster_Name | Country | Fraud_Type | Amount_Lost_USD |
|---|---|---|---|
| Maria_Rojas | Brazil | Money_Laundering | 470000 |
| Mei_Wang | China | Money_Laundering | 430000 |
| Tommy_Nguyen | Vietnam | Crypto_Scam | 230000 |
| Ivan_Kuznetsov | Russia | Crypto_Scam | 205000 |
| John_Doe | USA | Identity_Theft | 125000 |
2. Average Penalty for Caught Fraudsters
proc sql;
title "Average Penalty (in Years) for Caught Fraudsters";
select avg(Penalty_Years) as Avg_Penalty_Years
from Bank_Fraudsters
where Caught = 'Yes';
quit;
Output:
| Average Penalty (in Years) for Caught Fraudsters |
| Avg_Penalty_Years |
|---|
| 4.727273 |
3. Total Fraud Amount by Fraud Type
proc sql;
title "Total Amount Lost by Fraud Type";
select Fraud_Type, sum(Amount_Lost_USD) as Total_Loss format=dollar12.
from Bank_Fraudsters
group by Fraud_Type;
quit;
Output:
| Total Amount Lost by Fraud Type |
| Fraud_Type | Total_Loss |
|---|---|
| Account_Takeover | $95,000 |
| Card_Cloning | $161,000 |
| Check_Fraud | $27,000 |
| Crypto_Scam | $435,000 |
| Identity_Theft | $213,000 |
| Investment_Fraud | $202,000 |
| Loan_Scam | $125,000 |
| Money_Laundering | $900,000 |
| Online_Scam | $92,000 |
| Phishing | $131,000 |
| SIM_Swap | $69,000 |
| Skimming | $63,000 |
STEP 6: USING MACROS FOR REPEATABLE ANALYSIS
%macro CountryFraud(countryname);
title "Fraud Summary for &countryname.";
proc sql;
select Fraudster_Name, Fraud_Type, Amount_Lost_USD, Caught, Penalty_Years
from Bank_Fraudsters
where Country = "&countryname.";
quit;
%mend;
%CountryFraud(India);
Output:
| Fraud Summary for India |
| Fraudster_Name | Fraud_Type | Amount_Lost_USD | Caught | Penalty_Years |
|---|---|---|---|---|
| Raj_Patel | Card_Cloning | 84000 | No | . |
| Gopal_Sharma | SIM_Swap | 69000 | Yes | 4 |
%CountryFraud(USA);
Output:
| Fraud Summary for USA |
| Fraudster_Name | Fraud_Type | Amount_Lost_USD | Caught | Penalty_Years |
|---|---|---|---|---|
| John_Doe | Identity_Theft | 125000 | Yes | 5 |
Automate fraud type statistics:
%macro FraudTypeStats(type);
title "Statistics for &type. Fraud Type";
proc sql;
select count(*) as Total_Cases, avg(Amount_Lost_USD) as Avg_Loss
from Bank_Fraudsters
where Fraud_Type = "&type.";
quit;
%mend;
%FraudTypeStats(Phishing);
Output:
| Statistics for Phishing Fraud Type |
| Total_Cases | Avg_Loss |
|---|---|
| 2 | 65500 |
%FraudTypeStats(Money_Laundering);
Output:
| Statistics for Money_Laundering Fraud Type |
| Total_Cases | Avg_Loss |
|---|---|
| 2 | 450000 |
STEP 7: CONDITIONAL DERIVED COLUMN
data Bank_Fraudsters_Categorized;
set Bank_Fraudsters;
length Risk_Level $10;
if Amount_Lost_USD >= 200000 then Risk_Level = 'High';
else if Amount_Lost_USD >= 75000 then Risk_Level = 'Medium';
else Risk_Level = 'Low';
run;
proc print;run;
Output:
| Obs | Fraud_ID | Fraudster_Name | Country | Fraud_Type | Amount_Lost_USD | Year | Method | Caught | Penalty_Years | Risk_Level |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | John_Doe | USA | Identity_Theft | 125000 | 2019 | Online_Yield | Yes | 5 | Medium |
| 2 | 2 | Raj_Patel | India | Card_Cloning | 84000 | 2020 | ATM_Hack | No | . | Medium |
| 3 | 3 | Mei_Wang | China | Money_Laundering | 430000 | 2021 | Shell_Companies | Yes | 10 | High |
| 4 | 4 | Carlos_Soto | Mexico | Loan_Scam | 59000 | 2018 | Fake_Records | Yes | 3 | Low |
| 5 | 5 | Fatima_Ali | UAE | Phishing | 72000 | 2022 | Email_Spoofing | No | . | Low |
| 6 | 6 | Ivan_Kuznetsov | Russia | Crypto_Scam | 205000 | 2021 | Blockchain_Fraud | No | . | High |
| 7 | 7 | Jean_Martin | France | Investment_Fraud | 110000 | 2020 | Ponzi_Scheme | Yes | 6 | Medium |
| 8 | 8 | Linda_Kim | South_Korea | Skimming | 63000 | 2019 | ATM_Device | Yes | 4 | Low |
| 9 | 9 | George_Smith | UK | Account_Takeover | 95000 | 2022 | SIM_Swap | No | . | Medium |
| 10 | 10 | Akira_Tanaka | Japan | Check_Fraud | 27000 | 2018 | Forged_Cheques | Yes | 2 | Low |
| 11 | 11 | Ahmed_Khan | Pakistan | Online_Scam | 34000 | 2021 | Fake_Websites | No | . | Low |
| 12 | 12 | Maria_Rojas | Brazil | Money_Laundering | 470000 | 2020 | Shell_Companies | Yes | 8 | High |
| 13 | 13 | Bongani_Ndlovu | South_Africa | Phishing | 59000 | 2019 | Email_Spoofing | Yes | 3 | Low |
| 14 | 14 | Emily_Wilson | Canada | Identity_Theft | 88000 | 2022 | Fake_Documents | No | . | Medium |
| 15 | 15 | Elena_Popov | Ukraine | Card_Cloning | 77000 | 2021 | ATM_Hack | Yes | 5 | Medium |
| 16 | 16 | Andrew_Blake | Australia | Loan_Scam | 66000 | 2020 | Fake_Credit | No | . | Low |
| 17 | 17 | Hassan_Musa | Nigeria | Investment_Fraud | 92000 | 2021 | Ponzi_Scheme | No | . | Medium |
| 18 | 18 | Ingrid_Andersson | Sweden | Online_Scam | 58000 | 2019 | Fake_Ecommerce | Yes | 2 | Low |
| 19 | 19 | Gopal_Sharma | India | SIM_Swap | 69000 | 2022 | Mobile_Fraud | Yes | 4 | Low |
| 20 | 20 | Tommy_Nguyen | Vietnam | Crypto_Scam | 230000 | 2021 | Rug_Pull | No | . | High |
proc freq data=Bank_Fraudsters_Categorized;
tables Risk_Level;
title "Fraud Risk Level Categorization";
run;
Output:
| Fraud Risk Level Categorization |
| Risk_Level | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
|---|---|---|---|---|
| High | 4 | 20.00 | 4 | 20.00 |
| Low | 9 | 45.00 | 13 | 65.00 |
| Medium | 7 | 35.00 | 20 | 100.00 |
STEP 8: VISUALIZATION
proc sgplot data=Bank_Fraudsters;
vbar Fraud_Type / response=Amount_Lost_USD stat=sum;
title "Total Loss by Fraud Type";
run;
No comments:
Post a Comment