393.Can Cold Storage Data Reveal Hidden Losses, Wastage, and Fraud? – A Complete SAS Analytics Project
Can Cold Storage Data Reveal Hidden Losses, Wastage, and Fraud? – A Complete SAS Analytics Project
HERE IN THIS PROJECT WE USED THESE SAS STATEMENTS —
DATA STEP | PROC SQL | PROC PRINT | PROC SGPLOT | MACROS | PROC CORR | PROC MEANS | PROC FREQ | PROC UNIVARIATE | APPEND | MERGE | PROC TRANSPOSE | PROC DATASETS DELETE | DATA FUNCTIONS
TABLE OF CONTENTS
1. Introduction
2. Business Context
3. Project Objectives
4. Dataset Design
5. Data Creation Using Data Step
6. Data Cleaning and Numeric and Character Functions Usage
7. Date Handling Strategy
8. Utilization Classification Macro
9. Fraud Detection Macro
10. PROC SQL Analysis
11. PROC MEANS Analysis
12. PROC UNIVARIATE Analysis
13. PROC FREQ Analysis
14. PROC CORR Analysis
15. PROC SGPLOT Visualization
16. APPEND, SET, MERGE, and TRANSPOSE Usage
17. PROC DATASETS Cleanup
18. Business Insights
19. Interview-Oriented Explanations
20. Conclusion
1.1.Introduction
In real life, cold storage facilities play a vital role in:
- Food supply chains
- Agriculture
- Dairy and meat preservation
- Pharmaceutical storage
A small mistake in temperature control, energy reporting, or date tracking can lead to:
- Heavy financial loss
- Product quality degradation
- False reporting
- Even fraud
2. Business Context
Imagine a company managing multiple cold storage units across locations.
Each unit stores different products:
· Fruits
· Vegetables
· Meat
· Dairy
Management wants answers to simple but powerful questions:
· Are storage units being used efficiently?
· Is high energy usage really justified?
· Why is spoilage low even when temperature fluctuates a lot?
· Are storage dates realistic or manipulated?
· Which product types lose quality faster?
3. PROJECT OBJECTIVES
1.Create a cold storage dataset with realistic variables
2.Apply date functions (MDY, INTCK, INTNX)
3.Classify storage utilization using macros
4.Detect fraud patterns using macros
5.Perform statistical profiling
6.Apply string manipulation functions
7.Use SET, MERGE, APPEND, TRANSPOSE
8.Generate graphs
9.Clean datasets using PROC DATASETS
4. DATASET DESIGN
Variables Used
Variable | Description |
Storage_ID | Unique cold storage unit |
Product_Type | Fruits, Vegetables, Meat, Dairy |
Temperature_Variation | °C fluctuation |
Spoilage_Rate | % spoiled |
Energy_Usage | kWh/day |
Storage_Days | Days stored |
Quality_Index | Derived score |
Start_Date | Storage start |
End_Date | Storage end |
Utilization_Class | Macro derived |
Fraud_Flag | Macro derived |
5. DATA CREATION USING DATA STEP
data cold_storage_raw;
length Storage_ID $6 Product_Type $15;
format Start_Date End_Date date9. Spoilage_Rate percent8.2;
do i = 1 to 20;
Storage_ID = cats("CS", put(i, z3.));
Product_Type = scan("Fruits Vegetables Meat Dairy", mod(i,4)+1);
Temperature_Variation = round(ranuni(123)*6,0.1);
Spoilage_Rate = round(ranuni(321)*0.25,0.01);
Energy_Usage = round(200 + ranuni(456)*400,1);
Storage_Days = ceil(10 + ranuni(789)*40);
Quality_Index = round(100 - (Temperature_Variation*5 + Spoilage_Rate*100),0.1);
Start_Date = mdy(1,1,2024) + i*5;
End_Date = Start_Date + Storage_Days;
output;
end;
drop i;
run;
proc print data=cold_storage_raw;
run;
OUTPUT:
| Obs | Storage_ID | Product_Type | Start_Date | End_Date | Spoilage_Rate | Temperature_Variation | Energy_Usage | Storage_Days | Quality_Index |
|---|---|---|---|---|---|---|---|---|---|
| 1 | CS001 | Vegetables | 06JAN2024 | 22FEB2024 | 8.00% | 4.5 | 271 | 47 | 69.5 |
| 2 | CS002 | Meat | 11JAN2024 | 06FEB2024 | 6.00% | 2.1 | 515 | 26 | 83.5 |
| 3 | CS003 | Dairy | 16JAN2024 | 13FEB2024 | 5.00% | 0.7 | 510 | 28 | 91.5 |
| 4 | CS004 | Fruits | 21JAN2024 | 23FEB2024 | 7.00% | 5.8 | 486 | 33 | 64.0 |
| 5 | CS005 | Vegetables | 26JAN2024 | 11MAR2024 | 22.00% | 3.2 | 257 | 45 | 62.0 |
| 6 | CS006 | Meat | 31JAN2024 | 24FEB2024 | 19.00% | 3.9 | 481 | 24 | 61.5 |
| 7 | CS007 | Dairy | 05FEB2024 | 11MAR2024 | 23.00% | 3.1 | 219 | 35 | 61.5 |
| 8 | CS008 | Fruits | 10FEB2024 | 04MAR2024 | 9.00% | 3.9 | 435 | 23 | 71.5 |
| 9 | CS009 | Vegetables | 15FEB2024 | 19MAR2024 | 7.00% | 3.2 | 411 | 33 | 77.0 |
| 10 | CS010 | Meat | 20FEB2024 | 28MAR2024 | 3.00% | 0.6 | 433 | 37 | 94.0 |
| 11 | CS011 | Dairy | 25FEB2024 | 14APR2024 | 4.00% | 2.9 | 500 | 49 | 81.5 |
| 12 | CS012 | Fruits | 01MAR2024 | 08APR2024 | 24.00% | 4.5 | 238 | 38 | 53.5 |
| 13 | CS013 | Vegetables | 06MAR2024 | 17APR2024 | 7.00% | 5.4 | 464 | 42 | 66.0 |
| 14 | CS014 | Meat | 11MAR2024 | 12APR2024 | 8.00% | 4.8 | 512 | 32 | 68.0 |
| 15 | CS015 | Dairy | 16MAR2024 | 29MAR2024 | 21.00% | 4.2 | 239 | 13 | 58.0 |
| 16 | CS016 | Fruits | 21MAR2024 | 28APR2024 | 6.00% | 1.5 | 517 | 38 | 86.5 |
| 17 | CS017 | Vegetables | 26MAR2024 | 03MAY2024 | 21.00% | 1.6 | 389 | 38 | 71.0 |
| 18 | CS018 | Meat | 31MAR2024 | 06MAY2024 | 14.00% | 4.2 | 230 | 36 | 65.0 |
| 19 | CS019 | Dairy | 05APR2024 | 06MAY2024 | 21.00% | 4.5 | 238 | 31 | 56.5 |
| 20 | CS020 | Fruits | 10APR2024 | 29APR2024 | 5.00% | 2.1 | 242 | 19 | 84.5 |
· Simulates 20 cold storage units
· Uses MDY() for date creation
· Uses randomization to simulate real-world variability
· Quality_Index decreases with temperature fluctuation and spoilage
6. DATA CLEANING & CHARACTER FUNCTIONS
data cold_storage_clean;
set cold_storage_raw;
Product_Type_Clean = propcase(strip(Product_Type));
Storage_ID_Upper = upcase(Storage_ID);
Storage_ID_Lower = lowcase(Storage_ID);
Label = catx("-", Storage_ID, substr(Product_Type,1,3));
run;
proc print data=cold_storage_clean;
run;
OUTPUT:
| Obs | Storage_ID | Product_Type | Start_Date | End_Date | Spoilage_Rate | Temperature_Variation | Energy_Usage | Storage_Days | Quality_Index | Product_Type_Clean | Storage_ID_Upper | Storage_ID_Lower | Label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | CS001 | Vegetables | 06JAN2024 | 22FEB2024 | 8.00% | 4.5 | 271 | 47 | 69.5 | Vegetables | CS001 | cs001 | CS001-Veg |
| 2 | CS002 | Meat | 11JAN2024 | 06FEB2024 | 6.00% | 2.1 | 515 | 26 | 83.5 | Meat | CS002 | cs002 | CS002-Mea |
| 3 | CS003 | Dairy | 16JAN2024 | 13FEB2024 | 5.00% | 0.7 | 510 | 28 | 91.5 | Dairy | CS003 | cs003 | CS003-Dai |
| 4 | CS004 | Fruits | 21JAN2024 | 23FEB2024 | 7.00% | 5.8 | 486 | 33 | 64.0 | Fruits | CS004 | cs004 | CS004-Fru |
| 5 | CS005 | Vegetables | 26JAN2024 | 11MAR2024 | 22.00% | 3.2 | 257 | 45 | 62.0 | Vegetables | CS005 | cs005 | CS005-Veg |
| 6 | CS006 | Meat | 31JAN2024 | 24FEB2024 | 19.00% | 3.9 | 481 | 24 | 61.5 | Meat | CS006 | cs006 | CS006-Mea |
| 7 | CS007 | Dairy | 05FEB2024 | 11MAR2024 | 23.00% | 3.1 | 219 | 35 | 61.5 | Dairy | CS007 | cs007 | CS007-Dai |
| 8 | CS008 | Fruits | 10FEB2024 | 04MAR2024 | 9.00% | 3.9 | 435 | 23 | 71.5 | Fruits | CS008 | cs008 | CS008-Fru |
| 9 | CS009 | Vegetables | 15FEB2024 | 19MAR2024 | 7.00% | 3.2 | 411 | 33 | 77.0 | Vegetables | CS009 | cs009 | CS009-Veg |
| 10 | CS010 | Meat | 20FEB2024 | 28MAR2024 | 3.00% | 0.6 | 433 | 37 | 94.0 | Meat | CS010 | cs010 | CS010-Mea |
| 11 | CS011 | Dairy | 25FEB2024 | 14APR2024 | 4.00% | 2.9 | 500 | 49 | 81.5 | Dairy | CS011 | cs011 | CS011-Dai |
| 12 | CS012 | Fruits | 01MAR2024 | 08APR2024 | 24.00% | 4.5 | 238 | 38 | 53.5 | Fruits | CS012 | cs012 | CS012-Fru |
| 13 | CS013 | Vegetables | 06MAR2024 | 17APR2024 | 7.00% | 5.4 | 464 | 42 | 66.0 | Vegetables | CS013 | cs013 | CS013-Veg |
| 14 | CS014 | Meat | 11MAR2024 | 12APR2024 | 8.00% | 4.8 | 512 | 32 | 68.0 | Meat | CS014 | cs014 | CS014-Mea |
| 15 | CS015 | Dairy | 16MAR2024 | 29MAR2024 | 21.00% | 4.2 | 239 | 13 | 58.0 | Dairy | CS015 | cs015 | CS015-Dai |
| 16 | CS016 | Fruits | 21MAR2024 | 28APR2024 | 6.00% | 1.5 | 517 | 38 | 86.5 | Fruits | CS016 | cs016 | CS016-Fru |
| 17 | CS017 | Vegetables | 26MAR2024 | 03MAY2024 | 21.00% | 1.6 | 389 | 38 | 71.0 | Vegetables | CS017 | cs017 | CS017-Veg |
| 18 | CS018 | Meat | 31MAR2024 | 06MAY2024 | 14.00% | 4.2 | 230 | 36 | 65.0 | Meat | CS018 | cs018 | CS018-Mea |
| 19 | CS019 | Dairy | 05APR2024 | 06MAY2024 | 21.00% | 4.5 | 238 | 31 | 56.5 | Dairy | CS019 | cs019 | CS019-Dai |
| 20 | CS020 | Fruits | 10APR2024 | 29APR2024 | 5.00% | 2.1 | 242 | 19 | 84.5 | Fruits | CS020 | cs020 | CS020-Fru |
· STRIP/TRIM → removes spaces
· PROPCASE → business readable text
· COALESCE → avoids missing categories
· CATX → clean concatenation
7. DATE INTELLIGENCE
data cold_storage_dates;
set cold_storage_clean;
Storage_Months = intck('month', Start_Date, End_Date);
Review_Date = intnx('month', End_Date, 1, 's');
run;
proc print data=cold_storage_dates;
run;
OUTPUT:
| Obs | Storage_ID | Product_Type | Start_Date | End_Date | Spoilage_Rate | Temperature_Variation | Energy_Usage | Storage_Days | Quality_Index | Product_Type_Clean | Storage_ID_Upper | Storage_ID_Lower | Label | Storage_Months | Review_Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | CS001 | Vegetables | 06JAN2024 | 22FEB2024 | 8.00% | 4.5 | 271 | 47 | 69.5 | Vegetables | CS001 | cs001 | CS001-Veg | 1 | 23457 |
| 2 | CS002 | Meat | 11JAN2024 | 06FEB2024 | 6.00% | 2.1 | 515 | 26 | 83.5 | Meat | CS002 | cs002 | CS002-Mea | 1 | 23441 |
| 3 | CS003 | Dairy | 16JAN2024 | 13FEB2024 | 5.00% | 0.7 | 510 | 28 | 91.5 | Dairy | CS003 | cs003 | CS003-Dai | 1 | 23448 |
| 4 | CS004 | Fruits | 21JAN2024 | 23FEB2024 | 7.00% | 5.8 | 486 | 33 | 64.0 | Fruits | CS004 | cs004 | CS004-Fru | 1 | 23458 |
| 5 | CS005 | Vegetables | 26JAN2024 | 11MAR2024 | 22.00% | 3.2 | 257 | 45 | 62.0 | Vegetables | CS005 | cs005 | CS005-Veg | 2 | 23477 |
| 6 | CS006 | Meat | 31JAN2024 | 24FEB2024 | 19.00% | 3.9 | 481 | 24 | 61.5 | Meat | CS006 | cs006 | CS006-Mea | 1 | 23459 |
| 7 | CS007 | Dairy | 05FEB2024 | 11MAR2024 | 23.00% | 3.1 | 219 | 35 | 61.5 | Dairy | CS007 | cs007 | CS007-Dai | 1 | 23477 |
| 8 | CS008 | Fruits | 10FEB2024 | 04MAR2024 | 9.00% | 3.9 | 435 | 23 | 71.5 | Fruits | CS008 | cs008 | CS008-Fru | 1 | 23470 |
| 9 | CS009 | Vegetables | 15FEB2024 | 19MAR2024 | 7.00% | 3.2 | 411 | 33 | 77.0 | Vegetables | CS009 | cs009 | CS009-Veg | 1 | 23485 |
| 10 | CS010 | Meat | 20FEB2024 | 28MAR2024 | 3.00% | 0.6 | 433 | 37 | 94.0 | Meat | CS010 | cs010 | CS010-Mea | 1 | 23494 |
| 11 | CS011 | Dairy | 25FEB2024 | 14APR2024 | 4.00% | 2.9 | 500 | 49 | 81.5 | Dairy | CS011 | cs011 | CS011-Dai | 2 | 23510 |
| 12 | CS012 | Fruits | 01MAR2024 | 08APR2024 | 24.00% | 4.5 | 238 | 38 | 53.5 | Fruits | CS012 | cs012 | CS012-Fru | 1 | 23504 |
| 13 | CS013 | Vegetables | 06MAR2024 | 17APR2024 | 7.00% | 5.4 | 464 | 42 | 66.0 | Vegetables | CS013 | cs013 | CS013-Veg | 1 | 23513 |
| 14 | CS014 | Meat | 11MAR2024 | 12APR2024 | 8.00% | 4.8 | 512 | 32 | 68.0 | Meat | CS014 | cs014 | CS014-Mea | 1 | 23508 |
| 15 | CS015 | Dairy | 16MAR2024 | 29MAR2024 | 21.00% | 4.2 | 239 | 13 | 58.0 | Dairy | CS015 | cs015 | CS015-Dai | 0 | 23495 |
| 16 | CS016 | Fruits | 21MAR2024 | 28APR2024 | 6.00% | 1.5 | 517 | 38 | 86.5 | Fruits | CS016 | cs016 | CS016-Fru | 1 | 23524 |
| 17 | CS017 | Vegetables | 26MAR2024 | 03MAY2024 | 21.00% | 1.6 | 389 | 38 | 71.0 | Vegetables | CS017 | cs017 | CS017-Veg | 2 | 23530 |
| 18 | CS018 | Meat | 31MAR2024 | 06MAY2024 | 14.00% | 4.2 | 230 | 36 | 65.0 | Meat | CS018 | cs018 | CS018-Mea | 2 | 23533 |
| 19 | CS019 | Dairy | 05APR2024 | 06MAY2024 | 21.00% | 4.5 | 238 | 31 | 56.5 | Dairy | CS019 | cs019 | CS019-Dai | 1 | 23533 |
| 20 | CS020 | Fruits | 10APR2024 | 29APR2024 | 5.00% | 2.1 | 242 | 19 | 84.5 | Fruits | CS020 | cs020 | CS020-Fru | 0 | 23525 |
· INTCK → duration analysis
· INTNX → future review scheduling
8. UTILIZATION CLASSIFICATION MACRO
%macro utilization;
data cold_storage_util;
set cold_storage_dates;
length Utilization_Class $8.;
if Energy_Usage > 450 and Storage_Days > 30 then Utilization_Class="High";
else if 300 < Energy_Usage < 450 then Utilization_Class="Medium";
else Utilization_Class="Low";
run;
proc print data=cold_storage_util;
run;
%mend;
%utilization;
OUTPUT:
| Obs | Storage_ID | Product_Type | Start_Date | End_Date | Spoilage_Rate | Temperature_Variation | Energy_Usage | Storage_Days | Quality_Index | Product_Type_Clean | Storage_ID_Upper | Storage_ID_Lower | Label | Storage_Months | Review_Date | Utilization_Class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | CS001 | Vegetables | 06JAN2024 | 22FEB2024 | 8.00% | 4.5 | 271 | 47 | 69.5 | Vegetables | CS001 | cs001 | CS001-Veg | 1 | 23457 | Low |
| 2 | CS002 | Meat | 11JAN2024 | 06FEB2024 | 6.00% | 2.1 | 515 | 26 | 83.5 | Meat | CS002 | cs002 | CS002-Mea | 1 | 23441 | Low |
| 3 | CS003 | Dairy | 16JAN2024 | 13FEB2024 | 5.00% | 0.7 | 510 | 28 | 91.5 | Dairy | CS003 | cs003 | CS003-Dai | 1 | 23448 | Low |
| 4 | CS004 | Fruits | 21JAN2024 | 23FEB2024 | 7.00% | 5.8 | 486 | 33 | 64.0 | Fruits | CS004 | cs004 | CS004-Fru | 1 | 23458 | High |
| 5 | CS005 | Vegetables | 26JAN2024 | 11MAR2024 | 22.00% | 3.2 | 257 | 45 | 62.0 | Vegetables | CS005 | cs005 | CS005-Veg | 2 | 23477 | Low |
| 6 | CS006 | Meat | 31JAN2024 | 24FEB2024 | 19.00% | 3.9 | 481 | 24 | 61.5 | Meat | CS006 | cs006 | CS006-Mea | 1 | 23459 | Low |
| 7 | CS007 | Dairy | 05FEB2024 | 11MAR2024 | 23.00% | 3.1 | 219 | 35 | 61.5 | Dairy | CS007 | cs007 | CS007-Dai | 1 | 23477 | Low |
| 8 | CS008 | Fruits | 10FEB2024 | 04MAR2024 | 9.00% | 3.9 | 435 | 23 | 71.5 | Fruits | CS008 | cs008 | CS008-Fru | 1 | 23470 | Medium |
| 9 | CS009 | Vegetables | 15FEB2024 | 19MAR2024 | 7.00% | 3.2 | 411 | 33 | 77.0 | Vegetables | CS009 | cs009 | CS009-Veg | 1 | 23485 | Medium |
| 10 | CS010 | Meat | 20FEB2024 | 28MAR2024 | 3.00% | 0.6 | 433 | 37 | 94.0 | Meat | CS010 | cs010 | CS010-Mea | 1 | 23494 | Medium |
| 11 | CS011 | Dairy | 25FEB2024 | 14APR2024 | 4.00% | 2.9 | 500 | 49 | 81.5 | Dairy | CS011 | cs011 | CS011-Dai | 2 | 23510 | High |
| 12 | CS012 | Fruits | 01MAR2024 | 08APR2024 | 24.00% | 4.5 | 238 | 38 | 53.5 | Fruits | CS012 | cs012 | CS012-Fru | 1 | 23504 | Low |
| 13 | CS013 | Vegetables | 06MAR2024 | 17APR2024 | 7.00% | 5.4 | 464 | 42 | 66.0 | Vegetables | CS013 | cs013 | CS013-Veg | 1 | 23513 | High |
| 14 | CS014 | Meat | 11MAR2024 | 12APR2024 | 8.00% | 4.8 | 512 | 32 | 68.0 | Meat | CS014 | cs014 | CS014-Mea | 1 | 23508 | High |
| 15 | CS015 | Dairy | 16MAR2024 | 29MAR2024 | 21.00% | 4.2 | 239 | 13 | 58.0 | Dairy | CS015 | cs015 | CS015-Dai | 0 | 23495 | Low |
| 16 | CS016 | Fruits | 21MAR2024 | 28APR2024 | 6.00% | 1.5 | 517 | 38 | 86.5 | Fruits | CS016 | cs016 | CS016-Fru | 1 | 23524 | High |
| 17 | CS017 | Vegetables | 26MAR2024 | 03MAY2024 | 21.00% | 1.6 | 389 | 38 | 71.0 | Vegetables | CS017 | cs017 | CS017-Veg | 2 | 23530 | Medium |
| 18 | CS018 | Meat | 31MAR2024 | 06MAY2024 | 14.00% | 4.2 | 230 | 36 | 65.0 | Meat | CS018 | cs018 | CS018-Mea | 2 | 23533 | Low |
| 19 | CS019 | Dairy | 05APR2024 | 06MAY2024 | 21.00% | 4.5 | 238 | 31 | 56.5 | Dairy | CS019 | cs019 | CS019-Dai | 1 | 23533 | Low |
| 20 | CS020 | Fruits | 10APR2024 | 29APR2024 | 5.00% | 2.1 | 242 | 19 | 84.5 | Fruits | CS020 | cs020 | CS020-Fru | 0 | 23525 | Low |
· High energy + long duration = High utilization
· Medium usage = Moderate
· Low usage = Underutilized
9. FRAUD DETECTION MACRO
%macro fraud_check;
data cold_storage_fraud;
set cold_storage_util;
if Spoilage_Rate < 8.0 and Temperature_Variation > 4 then Fraud_Flag="YES";
else if Quality_Index > 95 and Spoilage_Rate > 0.20 then Fraud_Flag="YES";
else Fraud_Flag="NO";
run;
proc print data=cold_storage_fraud;
var Storage_ID Product_Type Spoilage_Rate Temperature_Variation Quality_Index Fraud_Flag;
run;
%mend;
%fraud_check;
OUTPUT:
| Obs | Storage_ID | Product_Type | Spoilage_Rate | Temperature_Variation | Quality_Index | Fraud_Flag |
|---|---|---|---|---|---|---|
| 1 | CS001 | Vegetables | 8.00% | 4.5 | 69.5 | YES |
| 2 | CS002 | Meat | 6.00% | 2.1 | 83.5 | NO |
| 3 | CS003 | Dairy | 5.00% | 0.7 | 91.5 | NO |
| 4 | CS004 | Fruits | 7.00% | 5.8 | 64.0 | YES |
| 5 | CS005 | Vegetables | 22.00% | 3.2 | 62.0 | NO |
| 6 | CS006 | Meat | 19.00% | 3.9 | 61.5 | NO |
| 7 | CS007 | Dairy | 23.00% | 3.1 | 61.5 | NO |
| 8 | CS008 | Fruits | 9.00% | 3.9 | 71.5 | NO |
| 9 | CS009 | Vegetables | 7.00% | 3.2 | 77.0 | NO |
| 10 | CS010 | Meat | 3.00% | 0.6 | 94.0 | NO |
| 11 | CS011 | Dairy | 4.00% | 2.9 | 81.5 | NO |
| 12 | CS012 | Fruits | 24.00% | 4.5 | 53.5 | YES |
| 13 | CS013 | Vegetables | 7.00% | 5.4 | 66.0 | YES |
| 14 | CS014 | Meat | 8.00% | 4.8 | 68.0 | YES |
| 15 | CS015 | Dairy | 21.00% | 4.2 | 58.0 | YES |
| 16 | CS016 | Fruits | 6.00% | 1.5 | 86.5 | NO |
| 17 | CS017 | Vegetables | 21.00% | 1.6 | 71.0 | NO |
| 18 | CS018 | Meat | 14.00% | 4.2 | 65.0 | YES |
| 19 | CS019 | Dairy | 21.00% | 4.5 | 56.5 | YES |
| 20 | CS020 | Fruits | 5.00% | 2.1 | 84.5 | NO |
Fraud Scenarios
· High temperature but very low spoilage
· Very high quality but high spoilage
10. PROC SQL – BUSINESS QUERIES
proc sql;
create table storage_summary as
select Product_Type,
count(*) as Units,
mean(Spoilage_Rate) as Avg_Spoilage format=percent8.2,
mean(Energy_Usage) as Avg_Energy
from cold_storage_fraud
group by Product_Type;
quit;
proc print data=storage_summary;
run;
OUTPUT:
| Obs | Product_Type | Units | Avg_Spoilage | Avg_Energy |
|---|---|---|---|---|
| 1 | Dairy | 5 | 14.80% | 341.2 |
| 2 | Fruits | 5 | 10.20% | 383.6 |
| 3 | Meat | 5 | 10.00% | 434.2 |
| 4 | Vegetables | 5 | 13.00% | 358.4 |
11. PROC MEANS
proc means data=cold_storage_fraud mean min max std;
var Energy_Usage Storage_Days Quality_Index;
run;
OUTPUT:
The MEANS Procedure
| Variable | Mean | Minimum | Maximum | Std Dev |
|---|---|---|---|---|
Energy_Usage Storage_Days Quality_Index | 379.3500000 33.3500000 71.3250000 | 219.0000000 13.0000000 53.5000000 | 517.0000000 49.0000000 94.0000000 | 120.4733100 9.2751791 12.0101657 |
12. PROC UNIVARIATE
proc univariate data=cold_storage_fraud;
var Spoilage_Rate;
histogram Spoilage_Rate;
run;
OUTPUT:
The UNIVARIATE Procedure
Variable: Spoilage_Rate
| Moments | |||
|---|---|---|---|
| N | 20 | Sum Weights | 20 |
| Mean | 0.12 | Sum Observations | 2.4 |
| Std Deviation | 0.0758114 | Variance | 0.00574737 |
| Skewness | 0.49766839 | Kurtosis | -1.614799 |
| Uncorrected SS | 0.3972 | Corrected SS | 0.1092 |
| Coeff Variation | 63.1761669 | Std Error Mean | 0.01695194 |
| Basic Statistical Measures | |||
|---|---|---|---|
| Location | Variability | ||
| Mean | 0.120000 | Std Deviation | 0.07581 |
| Median | 0.080000 | Variance | 0.00575 |
| Mode | 0.070000 | Range | 0.21000 |
| Interquartile Range | 0.15000 | ||
Note: The mode displayed is the smallest of 2 modes with a count of 3.
| Tests for Location: Mu0=0 | ||||
|---|---|---|---|---|
| Test | Statistic | p Value | ||
| Student's t | t | 7.078834 | Pr > |t| | <.0001 |
| Sign | M | 10 | Pr >= |M| | <.0001 |
| Signed Rank | S | 105 | Pr >= |S| | <.0001 |
| Quantiles (Definition 5) | |
|---|---|
| Level | Quantile |
| 100% Max | 0.240 |
| 99% | 0.240 |
| 95% | 0.235 |
| 90% | 0.225 |
| 75% Q3 | 0.210 |
| 50% Median | 0.080 |
| 25% Q1 | 0.060 |
| 10% | 0.045 |
| 5% | 0.035 |
| 1% | 0.030 |
| 0% Min | 0.030 |
| Extreme Observations | |||
|---|---|---|---|
| Lowest | Highest | ||
| Value | Obs | Value | Obs |
| 0.03 | 10 | 0.21 | 17 |
| 0.04 | 11 | 0.21 | 19 |
| 0.05 | 20 | 0.22 | 5 |
| 0.05 | 3 | 0.23 | 7 |
| 0.06 | 16 | 0.24 | 12 |
The UNIVARIATE Procedure
13. PROC FREQ
proc freq data=cold_storage_fraud;
tables Utilization_Class*Fraud_Flag / nocum;
run;
OUTPUT:
The FREQ Procedure
|
| ||||||||||||||||||||||||||||
14. PROC CORR
proc corr data=cold_storage_fraud;
var Temperature_Variation Spoilage_Rate Energy_Usage Quality_Index;
run;
OUTPUT:
The CORR Procedure
| 4 Variables: | Temperature_Variation Spoilage_Rate Energy_Usage Quality_Index |
|---|
| Simple Statistics | ||||||
|---|---|---|---|---|---|---|
| Variable | N | Mean | Std Dev | Sum | Minimum | Maximum |
| Temperature_Variation | 20 | 3.33500 | 1.50132 | 66.70000 | 0.60000 | 5.80000 |
| Spoilage_Rate | 20 | 0.12000 | 0.07581 | 2.40000 | 0.03000 | 0.24000 |
| Energy_Usage | 20 | 379.35000 | 120.47331 | 7587 | 219.00000 | 517.00000 |
| Quality_Index | 20 | 71.32500 | 12.01017 | 1427 | 53.50000 | 94.00000 |
| Pearson Correlation Coefficients, N = 20 Prob > |r| under H0: Rho=0 | ||||
|---|---|---|---|---|
| Temperature_Variation | Spoilage_Rate | Energy_Usage | Quality_Index | |
| Temperature_Variation | 1.00000 | 0.26728 0.2546 | -0.21709 0.3579 | -0.79374 <.0001 |
| Spoilage_Rate | 0.26728 0.2546 | 1.00000 | -0.64190 0.0023 | -0.79828 <.0001 |
| Energy_Usage | -0.21709 0.3579 | -0.64190 0.0023 | 1.00000 | 0.54087 0.0138 |
| Quality_Index | -0.79374 <.0001 | -0.79828 <.0001 | 0.54087 0.0138 | 1.00000 |
15. PROC SGPLOT
proc sgplot data=cold_storage_fraud;
scatter x=Temperature_Variation y=Spoilage_Rate;
reg x=Temperature_Variation y=Spoilage_Rate;
run;
OUTPUT:
16. TRANSPOSE
proc transpose data=storage_summary out=summary_t;
by Product_Type NotSorted;
run;
proc print data=summary_t;
run;
OUTPUT:
| Obs | Product_Type | _NAME_ | COL1 |
|---|---|---|---|
| 1 | Dairy | Units | 5.000 |
| 2 | Dairy | Avg_Spoilage | 0.148 |
| 3 | Dairy | Avg_Energy | 341.200 |
| 4 | Fruits | Units | 5.000 |
| 5 | Fruits | Avg_Spoilage | 0.102 |
| 6 | Fruits | Avg_Energy | 383.600 |
| 7 | Meat | Units | 5.000 |
| 8 | Meat | Avg_Spoilage | 0.100 |
| 9 | Meat | Avg_Energy | 434.200 |
| 10 | Vegetables | Units | 5.000 |
| 11 | Vegetables | Avg_Spoilage | 0.130 |
| 12 | Vegetables | Avg_Energy | 358.400 |
17. CREATE NEW DATA & APPEND
data storage_summary_new;
set storage_summary;
Report_Month = "NEXT_MONTH";
run;
proc print data=storage_summary_new;
run;
OUTPUT:
| Obs | Product_Type | Units | Avg_Spoilage | Avg_Energy | Report_Month |
|---|---|---|---|---|---|
| 1 | Dairy | 5 | 14.80% | 341.2 | NEXT_MONTH |
| 2 | Fruits | 5 | 10.20% | 383.6 | NEXT_MONTH |
| 3 | Meat | 5 | 10.00% | 434.2 | NEXT_MONTH |
| 4 | Vegetables | 5 | 13.00% | 358.4 | NEXT_MONTH |
proc append base=storage_summary
data=storage_summary_new force;
run;
proc print data=storage_summary;
run;
OUTPUT:
| Obs | Product_Type | Units | Avg_Spoilage | Avg_Energy |
|---|---|---|---|---|
| 1 | Dairy | 5 | 14.80% | 341.2 |
| 2 | Fruits | 5 | 10.20% | 383.6 |
| 3 | Meat | 5 | 10.00% | 434.2 |
| 4 | Vegetables | 5 | 13.00% | 358.4 |
| 5 | Dairy | 5 | 14.80% | 341.2 |
| 6 | Fruits | 5 | 10.20% | 383.6 |
| 7 | Meat | 5 | 10.00% | 434.2 |
| 8 | Vegetables | 5 | 13.00% | 358.4 |
18. MERGE & SET
proc sort data=cold_storage_fraud;by Product_Type;run;
proc print data=cold_storage_fraud;
run;
OUTPUT:
| Obs | Storage_ID | Product_Type | Start_Date | End_Date | Spoilage_Rate | Temperature_Variation | Energy_Usage | Storage_Days | Quality_Index | Product_Type_Clean | Storage_ID_Upper | Storage_ID_Lower | Label | Storage_Months | Review_Date | Utilization_Class | Fraud_Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | CS003 | Dairy | 16JAN2024 | 13FEB2024 | 5.00% | 0.7 | 510 | 28 | 91.5 | Dairy | CS003 | cs003 | CS003-Dai | 1 | 23448 | Low | NO |
| 2 | CS007 | Dairy | 05FEB2024 | 11MAR2024 | 23.00% | 3.1 | 219 | 35 | 61.5 | Dairy | CS007 | cs007 | CS007-Dai | 1 | 23477 | Low | NO |
| 3 | CS011 | Dairy | 25FEB2024 | 14APR2024 | 4.00% | 2.9 | 500 | 49 | 81.5 | Dairy | CS011 | cs011 | CS011-Dai | 2 | 23510 | High | NO |
| 4 | CS015 | Dairy | 16MAR2024 | 29MAR2024 | 21.00% | 4.2 | 239 | 13 | 58.0 | Dairy | CS015 | cs015 | CS015-Dai | 0 | 23495 | Low | YES |
| 5 | CS019 | Dairy | 05APR2024 | 06MAY2024 | 21.00% | 4.5 | 238 | 31 | 56.5 | Dairy | CS019 | cs019 | CS019-Dai | 1 | 23533 | Low | YES |
| 6 | CS004 | Fruits | 21JAN2024 | 23FEB2024 | 7.00% | 5.8 | 486 | 33 | 64.0 | Fruits | CS004 | cs004 | CS004-Fru | 1 | 23458 | High | YES |
| 7 | CS008 | Fruits | 10FEB2024 | 04MAR2024 | 9.00% | 3.9 | 435 | 23 | 71.5 | Fruits | CS008 | cs008 | CS008-Fru | 1 | 23470 | Medium | NO |
| 8 | CS012 | Fruits | 01MAR2024 | 08APR2024 | 24.00% | 4.5 | 238 | 38 | 53.5 | Fruits | CS012 | cs012 | CS012-Fru | 1 | 23504 | Low | YES |
| 9 | CS016 | Fruits | 21MAR2024 | 28APR2024 | 6.00% | 1.5 | 517 | 38 | 86.5 | Fruits | CS016 | cs016 | CS016-Fru | 1 | 23524 | High | NO |
| 10 | CS020 | Fruits | 10APR2024 | 29APR2024 | 5.00% | 2.1 | 242 | 19 | 84.5 | Fruits | CS020 | cs020 | CS020-Fru | 0 | 23525 | Low | NO |
| 11 | CS002 | Meat | 11JAN2024 | 06FEB2024 | 6.00% | 2.1 | 515 | 26 | 83.5 | Meat | CS002 | cs002 | CS002-Mea | 1 | 23441 | Low | NO |
| 12 | CS006 | Meat | 31JAN2024 | 24FEB2024 | 19.00% | 3.9 | 481 | 24 | 61.5 | Meat | CS006 | cs006 | CS006-Mea | 1 | 23459 | Low | NO |
| 13 | CS010 | Meat | 20FEB2024 | 28MAR2024 | 3.00% | 0.6 | 433 | 37 | 94.0 | Meat | CS010 | cs010 | CS010-Mea | 1 | 23494 | Medium | NO |
| 14 | CS014 | Meat | 11MAR2024 | 12APR2024 | 8.00% | 4.8 | 512 | 32 | 68.0 | Meat | CS014 | cs014 | CS014-Mea | 1 | 23508 | High | YES |
| 15 | CS018 | Meat | 31MAR2024 | 06MAY2024 | 14.00% | 4.2 | 230 | 36 | 65.0 | Meat | CS018 | cs018 | CS018-Mea | 2 | 23533 | Low | YES |
| 16 | CS001 | Vegetables | 06JAN2024 | 22FEB2024 | 8.00% | 4.5 | 271 | 47 | 69.5 | Vegetables | CS001 | cs001 | CS001-Veg | 1 | 23457 | Low | YES |
| 17 | CS005 | Vegetables | 26JAN2024 | 11MAR2024 | 22.00% | 3.2 | 257 | 45 | 62.0 | Vegetables | CS005 | cs005 | CS005-Veg | 2 | 23477 | Low | NO |
| 18 | CS009 | Vegetables | 15FEB2024 | 19MAR2024 | 7.00% | 3.2 | 411 | 33 | 77.0 | Vegetables | CS009 | cs009 | CS009-Veg | 1 | 23485 | Medium | NO |
| 19 | CS013 | Vegetables | 06MAR2024 | 17APR2024 | 7.00% | 5.4 | 464 | 42 | 66.0 | Vegetables | CS013 | cs013 | CS013-Veg | 1 | 23513 | High | YES |
| 20 | CS017 | Vegetables | 26MAR2024 | 03MAY2024 | 21.00% | 1.6 | 389 | 38 | 71.0 | Vegetables | CS017 | cs017 | CS017-Veg | 2 | 23530 | Medium | NO |
proc sort data=storage_summary;by Product_Type;run;
proc print data=storage_summary;
run;
OUTPUT:
| Obs | Product_Type | Units | Avg_Spoilage | Avg_Energy |
|---|---|---|---|---|
| 1 | Dairy | 5 | 14.80% | 341.2 |
| 2 | Dairy | 5 | 14.80% | 341.2 |
| 3 | Fruits | 5 | 10.20% | 383.6 |
| 4 | Fruits | 5 | 10.20% | 383.6 |
| 5 | Meat | 5 | 10.00% | 434.2 |
| 6 | Meat | 5 | 10.00% | 434.2 |
| 7 | Vegetables | 5 | 13.00% | 358.4 |
| 8 | Vegetables | 5 | 13.00% | 358.4 |
data merged_data;
merge cold_storage_fraud
storage_summary;
by Product_Type;
run;
proc print data=merged_data;
run;
OUTPUT:
| Obs | Storage_ID | Product_Type | Start_Date | End_Date | Spoilage_Rate | Temperature_Variation | Energy_Usage | Storage_Days | Quality_Index | Product_Type_Clean | Storage_ID_Upper | Storage_ID_Lower | Label | Storage_Months | Review_Date | Utilization_Class | Fraud_Flag | Units | Avg_Spoilage | Avg_Energy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | CS003 | Dairy | 16JAN2024 | 13FEB2024 | 5.00% | 0.7 | 510 | 28 | 91.5 | Dairy | CS003 | cs003 | CS003-Dai | 1 | 23448 | Low | NO | 5 | 14.80% | 341.2 |
| 2 | CS007 | Dairy | 05FEB2024 | 11MAR2024 | 23.00% | 3.1 | 219 | 35 | 61.5 | Dairy | CS007 | cs007 | CS007-Dai | 1 | 23477 | Low | NO | 5 | 14.80% | 341.2 |
| 3 | CS011 | Dairy | 25FEB2024 | 14APR2024 | 4.00% | 2.9 | 500 | 49 | 81.5 | Dairy | CS011 | cs011 | CS011-Dai | 2 | 23510 | High | NO | 5 | 14.80% | 341.2 |
| 4 | CS015 | Dairy | 16MAR2024 | 29MAR2024 | 21.00% | 4.2 | 239 | 13 | 58.0 | Dairy | CS015 | cs015 | CS015-Dai | 0 | 23495 | Low | YES | 5 | 14.80% | 341.2 |
| 5 | CS019 | Dairy | 05APR2024 | 06MAY2024 | 21.00% | 4.5 | 238 | 31 | 56.5 | Dairy | CS019 | cs019 | CS019-Dai | 1 | 23533 | Low | YES | 5 | 14.80% | 341.2 |
| 6 | CS004 | Fruits | 21JAN2024 | 23FEB2024 | 7.00% | 5.8 | 486 | 33 | 64.0 | Fruits | CS004 | cs004 | CS004-Fru | 1 | 23458 | High | YES | 5 | 10.20% | 383.6 |
| 7 | CS008 | Fruits | 10FEB2024 | 04MAR2024 | 9.00% | 3.9 | 435 | 23 | 71.5 | Fruits | CS008 | cs008 | CS008-Fru | 1 | 23470 | Medium | NO | 5 | 10.20% | 383.6 |
| 8 | CS012 | Fruits | 01MAR2024 | 08APR2024 | 24.00% | 4.5 | 238 | 38 | 53.5 | Fruits | CS012 | cs012 | CS012-Fru | 1 | 23504 | Low | YES | 5 | 10.20% | 383.6 |
| 9 | CS016 | Fruits | 21MAR2024 | 28APR2024 | 6.00% | 1.5 | 517 | 38 | 86.5 | Fruits | CS016 | cs016 | CS016-Fru | 1 | 23524 | High | NO | 5 | 10.20% | 383.6 |
| 10 | CS020 | Fruits | 10APR2024 | 29APR2024 | 5.00% | 2.1 | 242 | 19 | 84.5 | Fruits | CS020 | cs020 | CS020-Fru | 0 | 23525 | Low | NO | 5 | 10.20% | 383.6 |
| 11 | CS002 | Meat | 11JAN2024 | 06FEB2024 | 6.00% | 2.1 | 515 | 26 | 83.5 | Meat | CS002 | cs002 | CS002-Mea | 1 | 23441 | Low | NO | 5 | 10.00% | 434.2 |
| 12 | CS006 | Meat | 31JAN2024 | 24FEB2024 | 19.00% | 3.9 | 481 | 24 | 61.5 | Meat | CS006 | cs006 | CS006-Mea | 1 | 23459 | Low | NO | 5 | 10.00% | 434.2 |
| 13 | CS010 | Meat | 20FEB2024 | 28MAR2024 | 3.00% | 0.6 | 433 | 37 | 94.0 | Meat | CS010 | cs010 | CS010-Mea | 1 | 23494 | Medium | NO | 5 | 10.00% | 434.2 |
| 14 | CS014 | Meat | 11MAR2024 | 12APR2024 | 8.00% | 4.8 | 512 | 32 | 68.0 | Meat | CS014 | cs014 | CS014-Mea | 1 | 23508 | High | YES | 5 | 10.00% | 434.2 |
| 15 | CS018 | Meat | 31MAR2024 | 06MAY2024 | 14.00% | 4.2 | 230 | 36 | 65.0 | Meat | CS018 | cs018 | CS018-Mea | 2 | 23533 | Low | YES | 5 | 10.00% | 434.2 |
| 16 | CS001 | Vegetables | 06JAN2024 | 22FEB2024 | 8.00% | 4.5 | 271 | 47 | 69.5 | Vegetables | CS001 | cs001 | CS001-Veg | 1 | 23457 | Low | YES | 5 | 13.00% | 358.4 |
| 17 | CS005 | Vegetables | 26JAN2024 | 11MAR2024 | 22.00% | 3.2 | 257 | 45 | 62.0 | Vegetables | CS005 | cs005 | CS005-Veg | 2 | 23477 | Low | NO | 5 | 13.00% | 358.4 |
| 18 | CS009 | Vegetables | 15FEB2024 | 19MAR2024 | 7.00% | 3.2 | 411 | 33 | 77.0 | Vegetables | CS009 | cs009 | CS009-Veg | 1 | 23485 | Medium | NO | 5 | 13.00% | 358.4 |
| 19 | CS013 | Vegetables | 06MAR2024 | 17APR2024 | 7.00% | 5.4 | 464 | 42 | 66.0 | Vegetables | CS013 | cs013 | CS013-Veg | 1 | 23513 | High | YES | 5 | 13.00% | 358.4 |
| 20 | CS017 | Vegetables | 26MAR2024 | 03MAY2024 | 21.00% | 1.6 | 389 | 38 | 71.0 | Vegetables | CS017 | cs017 | CS017-Veg | 2 | 23530 | Medium | NO | 5 | 13.00% | 358.4 |
19. PROC DATASETS DELETE
proc datasets library=work;
delete cold_storage_raw;
quit;
LOG:
20. BUSINESS INSIGHTS
- Temperature variation directly increases spoilage
- Energy inefficiency correlates with longer storage
- Fraud patterns appear in unrealistic combinations
- Product type significantly affects quality retention
21. INTERVIEW VALUE
You can confidently explain:
- Why macros were used
- Why SQL was better than DATA step
- How fraud logic mirrors real audits
- How date intelligence supports scheduling
- How plots communicate business risks
22. Conclusion
This project proves that:
· Simple data can reveal powerful truths
· SAS is not just programming—it’s decision support
· Cold storage analytics protects money, quality, and trust
By combining:
· Data steps
· SQL
· Macros
· Statistics
· Visualization
INTERVIEW QUESTIONS FOR YOU
· What is the difference between LENGTH and FORMAT in SAS?
· What is the difference between MISSING and 0 in SAS?
· What are temporary and permanent SAS datasets?
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.
Disclaimer:
The datasets and analysis in this article are created for educational and demonstration purposes only. They do not represent COLD STORAGE data.
Our Mission:
This blog provides industry-focused SAS programming tutorials and analytics projects covering finance, healthcare, and technology.
This project is suitable for:
· Students learning SAS
· Data analysts building portfolios
· Professionals preparing for SAS interviews
· Bloggers writing about analytics and smart cities
· EV and energy industry professionals
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Follow Us On :
To deepen your understanding of SAS analytics, please refer to our other data science and industry-focused projects listed below:
1. Are We Losing Wildlife Faster Than We Think? – Animal Conservation Analysis Using SAS
2.Can Modern Art Data Explain What Actually Sells? – A Real-World SAS Analytics Project
3.Which Vehicles Truly Perform Better on Indian Roads? – A Real-World SAS Analytics Project
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Comments
Post a Comment