393.Can Cold Storage Data Reveal Hidden Losses, Wastage, and Fraud? – A Complete SAS Analytics Project

Can Cold Storage Data Reveal Hidden Losses, Wastage, and Fraud? – A Complete SAS Analytics Project

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

HERE IN THIS PROJECT WE USED THESE SAS STATEMENTS —
DATA STEP | PROC SQL |  PROC PRINT | PROC SGPLOT | MACROS | PROC CORR | PROC MEANS | PROC FREQ | PROC UNIVARIATE | APPEND | MERGE | PROC TRANSPOSE | PROC DATASETS DELETE | DATA FUNCTIONS

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

TABLE OF CONTENTS

1.     Introduction

2.     Business Context

3.     Project Objectives

4.     Dataset Design

5.     Data Creation Using Data Step

6.     Data Cleaning and Numeric and Character Functions Usage

7.     Date Handling Strategy

8.     Utilization Classification Macro

9.     Fraud Detection Macro

10.  PROC SQL Analysis

11.  PROC MEANS Analysis

12.  PROC UNIVARIATE Analysis

13.  PROC FREQ Analysis

14.  PROC CORR Analysis

15.  PROC SGPLOT Visualization

16.  APPEND, SET, MERGE, and TRANSPOSE Usage

17.  PROC DATASETS Cleanup

18.  Business Insights

19.  Interview-Oriented Explanations

20.  Conclusion

1.1.Introduction

In real life, cold storage facilities play a vital role in:

  • Food supply chains
  • Agriculture
  • Dairy and meat preservation
  • Pharmaceutical storage

A small mistake in temperature controlenergy reporting, or date tracking can lead to:

  • Heavy financial loss
  • Product quality degradation
  • False reporting
  • Even fraud

2. Business Context

Imagine a company managing multiple cold storage units across locations.

Each unit stores different products:

·       Fruits

·       Vegetables

·       Meat

·       Dairy

Management wants answers to simple but powerful questions:

·       Are storage units being used efficiently?

·       Is high energy usage really justified?

·       Why is spoilage low even when temperature fluctuates a lot?

·       Are storage dates realistic or manipulated?

·       Which product types lose quality faster?

3. PROJECT OBJECTIVES

1.Create a cold storage dataset with realistic variables
2.Apply date functions (MDY, INTCK, INTNX)
3.Classify storage utilization using macros
4.Detect fraud patterns using macros
5.Perform statistical profiling
6.Apply string manipulation functions
7.Use SET, MERGE, APPEND, TRANSPOSE
8.Generate graphs
9.Clean datasets using PROC DATASETS

4. DATASET DESIGN

Variables Used

Variable

Description

Storage_ID

Unique cold storage unit

Product_Type

Fruits, Vegetables, Meat, Dairy

Temperature_Variation

°C fluctuation

Spoilage_Rate

% spoiled

Energy_Usage

kWh/day

Storage_Days

Days stored

Quality_Index

Derived score

Start_Date

Storage start

End_Date

Storage end

Utilization_Class

Macro derived

Fraud_Flag

Macro derived

 

5. DATA CREATION USING DATA STEP

data cold_storage_raw;

    length Storage_ID $6 Product_Type $15;

    format Start_Date End_Date date9. Spoilage_Rate percent8.2;

    do i = 1 to 20;

        Storage_ID = cats("CS", put(i, z3.));

        Product_Type = scan("Fruits Vegetables Meat Dairy", mod(i,4)+1);

        Temperature_Variation = round(ranuni(123)*6,0.1);

        Spoilage_Rate = round(ranuni(321)*0.25,0.01);

        Energy_Usage = round(200 + ranuni(456)*400,1);

        Storage_Days = ceil(10 + ranuni(789)*40);

        Quality_Index = round(100 - (Temperature_Variation*5 + Spoilage_Rate*100),0.1);

        Start_Date = mdy(1,1,2024) + i*5;

        End_Date = Start_Date + Storage_Days;

        output;

    end;

    drop i;

run;

proc print data=cold_storage_raw;

run;

OUTPUT:

ObsStorage_IDProduct_TypeStart_DateEnd_DateSpoilage_RateTemperature_VariationEnergy_UsageStorage_DaysQuality_Index
1CS001Vegetables06JAN202422FEB20248.00%4.52714769.5
2CS002Meat11JAN202406FEB20246.00%2.15152683.5
3CS003Dairy16JAN202413FEB20245.00%0.75102891.5
4CS004Fruits21JAN202423FEB20247.00%5.84863364.0
5CS005Vegetables26JAN202411MAR202422.00%3.22574562.0
6CS006Meat31JAN202424FEB202419.00%3.94812461.5
7CS007Dairy05FEB202411MAR202423.00%3.12193561.5
8CS008Fruits10FEB202404MAR20249.00%3.94352371.5
9CS009Vegetables15FEB202419MAR20247.00%3.24113377.0
10CS010Meat20FEB202428MAR20243.00%0.64333794.0
11CS011Dairy25FEB202414APR20244.00%2.95004981.5
12CS012Fruits01MAR202408APR202424.00%4.52383853.5
13CS013Vegetables06MAR202417APR20247.00%5.44644266.0
14CS014Meat11MAR202412APR20248.00%4.85123268.0
15CS015Dairy16MAR202429MAR202421.00%4.22391358.0
16CS016Fruits21MAR202428APR20246.00%1.55173886.5
17CS017Vegetables26MAR202403MAY202421.00%1.63893871.0
18CS018Meat31MAR202406MAY202414.00%4.22303665.0
19CS019Dairy05APR202406MAY202421.00%4.52383156.5
20CS020Fruits10APR202429APR20245.00%2.12421984.5

·  Simulates 20 cold storage units

·  Uses MDY() for date creation

·  Uses randomization to simulate real-world variability

·  Quality_Index decreases with temperature fluctuation and spoilage

6. DATA CLEANING & CHARACTER FUNCTIONS

data cold_storage_clean;

    set cold_storage_raw;

    Product_Type_Clean = propcase(strip(Product_Type));

    Storage_ID_Upper = upcase(Storage_ID);

    Storage_ID_Lower = lowcase(Storage_ID);

    Label = catx("-", Storage_ID, substr(Product_Type,1,3));

run;

proc print data=cold_storage_clean;

run;

OUTPUT:

ObsStorage_IDProduct_TypeStart_DateEnd_DateSpoilage_RateTemperature_VariationEnergy_UsageStorage_DaysQuality_IndexProduct_Type_CleanStorage_ID_UpperStorage_ID_LowerLabel
1CS001Vegetables06JAN202422FEB20248.00%4.52714769.5VegetablesCS001cs001CS001-Veg
2CS002Meat11JAN202406FEB20246.00%2.15152683.5MeatCS002cs002CS002-Mea
3CS003Dairy16JAN202413FEB20245.00%0.75102891.5DairyCS003cs003CS003-Dai
4CS004Fruits21JAN202423FEB20247.00%5.84863364.0FruitsCS004cs004CS004-Fru
5CS005Vegetables26JAN202411MAR202422.00%3.22574562.0VegetablesCS005cs005CS005-Veg
6CS006Meat31JAN202424FEB202419.00%3.94812461.5MeatCS006cs006CS006-Mea
7CS007Dairy05FEB202411MAR202423.00%3.12193561.5DairyCS007cs007CS007-Dai
8CS008Fruits10FEB202404MAR20249.00%3.94352371.5FruitsCS008cs008CS008-Fru
9CS009Vegetables15FEB202419MAR20247.00%3.24113377.0VegetablesCS009cs009CS009-Veg
10CS010Meat20FEB202428MAR20243.00%0.64333794.0MeatCS010cs010CS010-Mea
11CS011Dairy25FEB202414APR20244.00%2.95004981.5DairyCS011cs011CS011-Dai
12CS012Fruits01MAR202408APR202424.00%4.52383853.5FruitsCS012cs012CS012-Fru
13CS013Vegetables06MAR202417APR20247.00%5.44644266.0VegetablesCS013cs013CS013-Veg
14CS014Meat11MAR202412APR20248.00%4.85123268.0MeatCS014cs014CS014-Mea
15CS015Dairy16MAR202429MAR202421.00%4.22391358.0DairyCS015cs015CS015-Dai
16CS016Fruits21MAR202428APR20246.00%1.55173886.5FruitsCS016cs016CS016-Fru
17CS017Vegetables26MAR202403MAY202421.00%1.63893871.0VegetablesCS017cs017CS017-Veg
18CS018Meat31MAR202406MAY202414.00%4.22303665.0MeatCS018cs018CS018-Mea
19CS019Dairy05APR202406MAY202421.00%4.52383156.5DairyCS019cs019CS019-Dai
20CS020Fruits10APR202429APR20245.00%2.12421984.5FruitsCS020cs020CS020-Fru

·  STRIP/TRIM → removes spaces

·  PROPCASE → business readable text

·  COALESCE → avoids missing categories

·  CATX → clean concatenation

7. DATE INTELLIGENCE

data cold_storage_dates;

    set cold_storage_clean;

    Storage_Months = intck('month', Start_Date, End_Date);

    Review_Date = intnx('month', End_Date, 1, 's');

run;

proc print data=cold_storage_dates;

run;

OUTPUT:

ObsStorage_IDProduct_TypeStart_DateEnd_DateSpoilage_RateTemperature_VariationEnergy_UsageStorage_DaysQuality_IndexProduct_Type_CleanStorage_ID_UpperStorage_ID_LowerLabelStorage_MonthsReview_Date
1CS001Vegetables06JAN202422FEB20248.00%4.52714769.5VegetablesCS001cs001CS001-Veg123457
2CS002Meat11JAN202406FEB20246.00%2.15152683.5MeatCS002cs002CS002-Mea123441
3CS003Dairy16JAN202413FEB20245.00%0.75102891.5DairyCS003cs003CS003-Dai123448
4CS004Fruits21JAN202423FEB20247.00%5.84863364.0FruitsCS004cs004CS004-Fru123458
5CS005Vegetables26JAN202411MAR202422.00%3.22574562.0VegetablesCS005cs005CS005-Veg223477
6CS006Meat31JAN202424FEB202419.00%3.94812461.5MeatCS006cs006CS006-Mea123459
7CS007Dairy05FEB202411MAR202423.00%3.12193561.5DairyCS007cs007CS007-Dai123477
8CS008Fruits10FEB202404MAR20249.00%3.94352371.5FruitsCS008cs008CS008-Fru123470
9CS009Vegetables15FEB202419MAR20247.00%3.24113377.0VegetablesCS009cs009CS009-Veg123485
10CS010Meat20FEB202428MAR20243.00%0.64333794.0MeatCS010cs010CS010-Mea123494
11CS011Dairy25FEB202414APR20244.00%2.95004981.5DairyCS011cs011CS011-Dai223510
12CS012Fruits01MAR202408APR202424.00%4.52383853.5FruitsCS012cs012CS012-Fru123504
13CS013Vegetables06MAR202417APR20247.00%5.44644266.0VegetablesCS013cs013CS013-Veg123513
14CS014Meat11MAR202412APR20248.00%4.85123268.0MeatCS014cs014CS014-Mea123508
15CS015Dairy16MAR202429MAR202421.00%4.22391358.0DairyCS015cs015CS015-Dai023495
16CS016Fruits21MAR202428APR20246.00%1.55173886.5FruitsCS016cs016CS016-Fru123524
17CS017Vegetables26MAR202403MAY202421.00%1.63893871.0VegetablesCS017cs017CS017-Veg223530
18CS018Meat31MAR202406MAY202414.00%4.22303665.0MeatCS018cs018CS018-Mea223533
19CS019Dairy05APR202406MAY202421.00%4.52383156.5DairyCS019cs019CS019-Dai123533
20CS020Fruits10APR202429APR20245.00%2.12421984.5FruitsCS020cs020CS020-Fru023525

·  INTCK → duration analysis

·  INTNX → future review scheduling

8. UTILIZATION CLASSIFICATION MACRO

%macro utilization;

data cold_storage_util;

    set cold_storage_dates;

length Utilization_Class $8.;

    if Energy_Usage > 450 and Storage_Days > 30 then Utilization_Class="High";

    else if 300 < Energy_Usage < 450 then Utilization_Class="Medium";

    else Utilization_Class="Low";

run;

proc print data=cold_storage_util;

run;

%mend;


%utilization;

OUTPUT:

ObsStorage_IDProduct_TypeStart_DateEnd_DateSpoilage_RateTemperature_VariationEnergy_UsageStorage_DaysQuality_IndexProduct_Type_CleanStorage_ID_UpperStorage_ID_LowerLabelStorage_MonthsReview_DateUtilization_Class
1CS001Vegetables06JAN202422FEB20248.00%4.52714769.5VegetablesCS001cs001CS001-Veg123457Low
2CS002Meat11JAN202406FEB20246.00%2.15152683.5MeatCS002cs002CS002-Mea123441Low
3CS003Dairy16JAN202413FEB20245.00%0.75102891.5DairyCS003cs003CS003-Dai123448Low
4CS004Fruits21JAN202423FEB20247.00%5.84863364.0FruitsCS004cs004CS004-Fru123458High
5CS005Vegetables26JAN202411MAR202422.00%3.22574562.0VegetablesCS005cs005CS005-Veg223477Low
6CS006Meat31JAN202424FEB202419.00%3.94812461.5MeatCS006cs006CS006-Mea123459Low
7CS007Dairy05FEB202411MAR202423.00%3.12193561.5DairyCS007cs007CS007-Dai123477Low
8CS008Fruits10FEB202404MAR20249.00%3.94352371.5FruitsCS008cs008CS008-Fru123470Medium
9CS009Vegetables15FEB202419MAR20247.00%3.24113377.0VegetablesCS009cs009CS009-Veg123485Medium
10CS010Meat20FEB202428MAR20243.00%0.64333794.0MeatCS010cs010CS010-Mea123494Medium
11CS011Dairy25FEB202414APR20244.00%2.95004981.5DairyCS011cs011CS011-Dai223510High
12CS012Fruits01MAR202408APR202424.00%4.52383853.5FruitsCS012cs012CS012-Fru123504Low
13CS013Vegetables06MAR202417APR20247.00%5.44644266.0VegetablesCS013cs013CS013-Veg123513High
14CS014Meat11MAR202412APR20248.00%4.85123268.0MeatCS014cs014CS014-Mea123508High
15CS015Dairy16MAR202429MAR202421.00%4.22391358.0DairyCS015cs015CS015-Dai023495Low
16CS016Fruits21MAR202428APR20246.00%1.55173886.5FruitsCS016cs016CS016-Fru123524High
17CS017Vegetables26MAR202403MAY202421.00%1.63893871.0VegetablesCS017cs017CS017-Veg223530Medium
18CS018Meat31MAR202406MAY202414.00%4.22303665.0MeatCS018cs018CS018-Mea223533Low
19CS019Dairy05APR202406MAY202421.00%4.52383156.5DairyCS019cs019CS019-Dai123533Low
20CS020Fruits10APR202429APR20245.00%2.12421984.5FruitsCS020cs020CS020-Fru023525Low

·  High energy + long duration = High utilization

·  Medium usage = Moderate

·  Low usage = Underutilized

9. FRAUD DETECTION MACRO

%macro fraud_check;

data cold_storage_fraud;

    set cold_storage_util;

    if Spoilage_Rate < 8.0 and Temperature_Variation > 4 then Fraud_Flag="YES";

    else if Quality_Index > 95 and Spoilage_Rate > 0.20 then Fraud_Flag="YES";

    else Fraud_Flag="NO";

run;

proc print data=cold_storage_fraud;

var Storage_ID Product_Type Spoilage_Rate Temperature_Variation Quality_Index Fraud_Flag;

run;

%mend;


%fraud_check;

OUTPUT:

ObsStorage_IDProduct_TypeSpoilage_RateTemperature_VariationQuality_IndexFraud_Flag
1CS001Vegetables8.00%4.569.5YES
2CS002Meat6.00%2.183.5NO
3CS003Dairy5.00%0.791.5NO
4CS004Fruits7.00%5.864.0YES
5CS005Vegetables22.00%3.262.0NO
6CS006Meat19.00%3.961.5NO
7CS007Dairy23.00%3.161.5NO
8CS008Fruits9.00%3.971.5NO
9CS009Vegetables7.00%3.277.0NO
10CS010Meat3.00%0.694.0NO
11CS011Dairy4.00%2.981.5NO
12CS012Fruits24.00%4.553.5YES
13CS013Vegetables7.00%5.466.0YES
14CS014Meat8.00%4.868.0YES
15CS015Dairy21.00%4.258.0YES
16CS016Fruits6.00%1.586.5NO
17CS017Vegetables21.00%1.671.0NO
18CS018Meat14.00%4.265.0YES
19CS019Dairy21.00%4.556.5YES
20CS020Fruits5.00%2.184.5NO

Fraud Scenarios

·       High temperature but very low spoilage

·       Very high quality but high spoilage

10. PROC SQL – BUSINESS QUERIES

proc sql;

    create table storage_summary as

    select Product_Type,

           count(*) as Units,

           mean(Spoilage_Rate) as Avg_Spoilage format=percent8.2,

           mean(Energy_Usage) as Avg_Energy

    from cold_storage_fraud

    group by Product_Type;

quit;

proc print data=storage_summary;

run;

OUTPUT:

ObsProduct_TypeUnitsAvg_SpoilageAvg_Energy
1Dairy514.80%341.2
2Fruits510.20%383.6
3Meat510.00%434.2
4Vegetables513.00%358.4

11. PROC MEANS

proc means data=cold_storage_fraud mean min max std;

    var Energy_Usage Storage_Days Quality_Index;

run;

OUTPUT:

The MEANS Procedure

VariableMeanMinimumMaximumStd Dev
Energy_Usage
Storage_Days
Quality_Index
379.3500000
33.3500000
71.3250000
219.0000000
13.0000000
53.5000000
517.0000000
49.0000000
94.0000000
120.4733100
9.2751791
12.0101657

12. PROC UNIVARIATE

proc univariate data=cold_storage_fraud;

    var Spoilage_Rate;

    histogram Spoilage_Rate;

run;

OUTPUT:

The UNIVARIATE Procedure

Variable: Spoilage_Rate

Moments
N20Sum Weights20
Mean0.12Sum Observations2.4
Std Deviation0.0758114Variance0.00574737
Skewness0.49766839Kurtosis-1.614799
Uncorrected SS0.3972Corrected SS0.1092
Coeff Variation63.1761669Std Error Mean0.01695194
Basic Statistical Measures
LocationVariability
Mean0.120000Std Deviation0.07581
Median0.080000Variance0.00575
Mode0.070000Range0.21000
  Interquartile Range0.15000

Note: The mode displayed is the smallest of 2 modes with a count of 3.

Tests for Location: Mu0=0
TestStatisticp Value
Student's tt7.078834Pr > |t|<.0001
SignM10Pr >= |M|<.0001
Signed RankS105Pr >= |S|<.0001
Quantiles (Definition 5)
LevelQuantile
100% Max0.240
99%0.240
95%0.235
90%0.225
75% Q30.210
50% Median0.080
25% Q10.060
10%0.045
5%0.035
1%0.030
0% Min0.030
Extreme Observations
LowestHighest
ValueObsValueObs
0.03100.2117
0.04110.2119
0.05200.225
0.0530.237
0.06160.2412

The UNIVARIATE Procedure

Histogram for Spoilage_Rate

13. PROC FREQ

proc freq data=cold_storage_fraud;

    tables Utilization_Class*Fraud_Flag / nocum;

run;

OUTPUT:

The FREQ Procedure

Frequency
Percent
Row Pct
Col Pct
Table of Utilization_Class by Fraud_Flag
Utilization_ClassFraud_Flag
NOYESTotal
High
2
10.00
40.00
16.67
3
15.00
60.00
37.50
5
25.00
 
 
Low
6
30.00
54.55
50.00
5
25.00
45.45
62.50
11
55.00
 
 
Medium
4
20.00
100.00
33.33
0
0.00
0.00
0.00
4
20.00
 
 
Total
12
60.00
8
40.00
20
100.00

14. PROC CORR

proc corr data=cold_storage_fraud;

    var Temperature_Variation Spoilage_Rate Energy_Usage Quality_Index;

run;

OUTPUT:

The CORR Procedure

4 Variables:Temperature_Variation Spoilage_Rate Energy_Usage Quality_Index
Simple Statistics
VariableNMeanStd DevSumMinimumMaximum
Temperature_Variation203.335001.5013266.700000.600005.80000
Spoilage_Rate200.120000.075812.400000.030000.24000
Energy_Usage20379.35000120.473317587219.00000517.00000
Quality_Index2071.3250012.01017142753.5000094.00000
Pearson Correlation Coefficients, N = 20
Prob > |r| under H0: Rho=0
 Temperature_VariationSpoilage_RateEnergy_UsageQuality_Index
Temperature_Variation
1.00000
 
0.26728
0.2546
-0.21709
0.3579
-0.79374
<.0001
Spoilage_Rate
0.26728
0.2546
1.00000
 
-0.64190
0.0023
-0.79828
<.0001
Energy_Usage
-0.21709
0.3579
-0.64190
0.0023
1.00000
 
0.54087
0.0138
Quality_Index
-0.79374
<.0001
-0.79828
<.0001
0.54087
0.0138
1.00000
 

15. PROC SGPLOT

proc sgplot data=cold_storage_fraud;

    scatter x=Temperature_Variation y=Spoilage_Rate;

    reg x=Temperature_Variation y=Spoilage_Rate;

run;

OUTPUT:

The SGPlot Procedure

16. TRANSPOSE

proc transpose data=storage_summary out=summary_t;

    by Product_Type NotSorted;

run;

proc print data=summary_t;

run;

OUTPUT:

ObsProduct_Type_NAME_COL1
1DairyUnits5.000
2DairyAvg_Spoilage0.148
3DairyAvg_Energy341.200
4FruitsUnits5.000
5FruitsAvg_Spoilage0.102
6FruitsAvg_Energy383.600
7MeatUnits5.000
8MeatAvg_Spoilage0.100
9MeatAvg_Energy434.200
10VegetablesUnits5.000
11VegetablesAvg_Spoilage0.130
12VegetablesAvg_Energy358.400

17. CREATE NEW DATA & APPEND

data storage_summary_new;

    set storage_summary;

    Report_Month = "NEXT_MONTH";

run;

proc print data=storage_summary_new;

run;

OUTPUT:

ObsProduct_TypeUnitsAvg_SpoilageAvg_EnergyReport_Month
1Dairy514.80%341.2NEXT_MONTH
2Fruits510.20%383.6NEXT_MONTH
3Meat510.00%434.2NEXT_MONTH
4Vegetables513.00%358.4NEXT_MONTH

proc append base=storage_summary

            data=storage_summary_new force;

run;

proc print data=storage_summary;

run;

OUTPUT:

ObsProduct_TypeUnitsAvg_SpoilageAvg_Energy
1Dairy514.80%341.2
2Fruits510.20%383.6
3Meat510.00%434.2
4Vegetables513.00%358.4
5Dairy514.80%341.2
6Fruits510.20%383.6
7Meat510.00%434.2
8Vegetables513.00%358.4

18. MERGE & SET

proc sort data=cold_storage_fraud;by Product_Type;run;

proc print data=cold_storage_fraud;

run;

OUTPUT:

ObsStorage_IDProduct_TypeStart_DateEnd_DateSpoilage_RateTemperature_VariationEnergy_UsageStorage_DaysQuality_IndexProduct_Type_CleanStorage_ID_UpperStorage_ID_LowerLabelStorage_MonthsReview_DateUtilization_ClassFraud_Flag
1CS003Dairy16JAN202413FEB20245.00%0.75102891.5DairyCS003cs003CS003-Dai123448LowNO
2CS007Dairy05FEB202411MAR202423.00%3.12193561.5DairyCS007cs007CS007-Dai123477LowNO
3CS011Dairy25FEB202414APR20244.00%2.95004981.5DairyCS011cs011CS011-Dai223510HighNO
4CS015Dairy16MAR202429MAR202421.00%4.22391358.0DairyCS015cs015CS015-Dai023495LowYES
5CS019Dairy05APR202406MAY202421.00%4.52383156.5DairyCS019cs019CS019-Dai123533LowYES
6CS004Fruits21JAN202423FEB20247.00%5.84863364.0FruitsCS004cs004CS004-Fru123458HighYES
7CS008Fruits10FEB202404MAR20249.00%3.94352371.5FruitsCS008cs008CS008-Fru123470MediumNO
8CS012Fruits01MAR202408APR202424.00%4.52383853.5FruitsCS012cs012CS012-Fru123504LowYES
9CS016Fruits21MAR202428APR20246.00%1.55173886.5FruitsCS016cs016CS016-Fru123524HighNO
10CS020Fruits10APR202429APR20245.00%2.12421984.5FruitsCS020cs020CS020-Fru023525LowNO
11CS002Meat11JAN202406FEB20246.00%2.15152683.5MeatCS002cs002CS002-Mea123441LowNO
12CS006Meat31JAN202424FEB202419.00%3.94812461.5MeatCS006cs006CS006-Mea123459LowNO
13CS010Meat20FEB202428MAR20243.00%0.64333794.0MeatCS010cs010CS010-Mea123494MediumNO
14CS014Meat11MAR202412APR20248.00%4.85123268.0MeatCS014cs014CS014-Mea123508HighYES
15CS018Meat31MAR202406MAY202414.00%4.22303665.0MeatCS018cs018CS018-Mea223533LowYES
16CS001Vegetables06JAN202422FEB20248.00%4.52714769.5VegetablesCS001cs001CS001-Veg123457LowYES
17CS005Vegetables26JAN202411MAR202422.00%3.22574562.0VegetablesCS005cs005CS005-Veg223477LowNO
18CS009Vegetables15FEB202419MAR20247.00%3.24113377.0VegetablesCS009cs009CS009-Veg123485MediumNO
19CS013Vegetables06MAR202417APR20247.00%5.44644266.0VegetablesCS013cs013CS013-Veg123513HighYES
20CS017Vegetables26MAR202403MAY202421.00%1.63893871.0VegetablesCS017cs017CS017-Veg223530MediumNO

proc sort data=storage_summary;by Product_Type;run;

proc print data=storage_summary;

run;

OUTPUT:

ObsProduct_TypeUnitsAvg_SpoilageAvg_Energy
1Dairy514.80%341.2
2Dairy514.80%341.2
3Fruits510.20%383.6
4Fruits510.20%383.6
5Meat510.00%434.2
6Meat510.00%434.2
7Vegetables513.00%358.4
8Vegetables513.00%358.4

data merged_data;

    merge cold_storage_fraud 

           storage_summary;

    by Product_Type;

run;

proc print data=merged_data;

run;

OUTPUT:

ObsStorage_IDProduct_TypeStart_DateEnd_DateSpoilage_RateTemperature_VariationEnergy_UsageStorage_DaysQuality_IndexProduct_Type_CleanStorage_ID_UpperStorage_ID_LowerLabelStorage_MonthsReview_DateUtilization_ClassFraud_FlagUnitsAvg_SpoilageAvg_Energy
1CS003Dairy16JAN202413FEB20245.00%0.75102891.5DairyCS003cs003CS003-Dai123448LowNO514.80%341.2
2CS007Dairy05FEB202411MAR202423.00%3.12193561.5DairyCS007cs007CS007-Dai123477LowNO514.80%341.2
3CS011Dairy25FEB202414APR20244.00%2.95004981.5DairyCS011cs011CS011-Dai223510HighNO514.80%341.2
4CS015Dairy16MAR202429MAR202421.00%4.22391358.0DairyCS015cs015CS015-Dai023495LowYES514.80%341.2
5CS019Dairy05APR202406MAY202421.00%4.52383156.5DairyCS019cs019CS019-Dai123533LowYES514.80%341.2
6CS004Fruits21JAN202423FEB20247.00%5.84863364.0FruitsCS004cs004CS004-Fru123458HighYES510.20%383.6
7CS008Fruits10FEB202404MAR20249.00%3.94352371.5FruitsCS008cs008CS008-Fru123470MediumNO510.20%383.6
8CS012Fruits01MAR202408APR202424.00%4.52383853.5FruitsCS012cs012CS012-Fru123504LowYES510.20%383.6
9CS016Fruits21MAR202428APR20246.00%1.55173886.5FruitsCS016cs016CS016-Fru123524HighNO510.20%383.6
10CS020Fruits10APR202429APR20245.00%2.12421984.5FruitsCS020cs020CS020-Fru023525LowNO510.20%383.6
11CS002Meat11JAN202406FEB20246.00%2.15152683.5MeatCS002cs002CS002-Mea123441LowNO510.00%434.2
12CS006Meat31JAN202424FEB202419.00%3.94812461.5MeatCS006cs006CS006-Mea123459LowNO510.00%434.2
13CS010Meat20FEB202428MAR20243.00%0.64333794.0MeatCS010cs010CS010-Mea123494MediumNO510.00%434.2
14CS014Meat11MAR202412APR20248.00%4.85123268.0MeatCS014cs014CS014-Mea123508HighYES510.00%434.2
15CS018Meat31MAR202406MAY202414.00%4.22303665.0MeatCS018cs018CS018-Mea223533LowYES510.00%434.2
16CS001Vegetables06JAN202422FEB20248.00%4.52714769.5VegetablesCS001cs001CS001-Veg123457LowYES513.00%358.4
17CS005Vegetables26JAN202411MAR202422.00%3.22574562.0VegetablesCS005cs005CS005-Veg223477LowNO513.00%358.4
18CS009Vegetables15FEB202419MAR20247.00%3.24113377.0VegetablesCS009cs009CS009-Veg123485MediumNO513.00%358.4
19CS013Vegetables06MAR202417APR20247.00%5.44644266.0VegetablesCS013cs013CS013-Veg123513HighYES513.00%358.4
20CS017Vegetables26MAR202403MAY202421.00%1.63893871.0VegetablesCS017cs017CS017-Veg223530MediumNO513.00%358.4

19. PROC DATASETS DELETE

proc datasets library=work;

    delete cold_storage_raw;

quit;

LOG:

NOTE: Deleting WORK.COLD_STORAGE_RAW (memtype=DATA).

20. BUSINESS INSIGHTS

  • Temperature variation directly increases spoilage
  • Energy inefficiency correlates with longer storage
  • Fraud patterns appear in unrealistic combinations
  • Product type significantly affects quality retention

21. INTERVIEW VALUE

You can confidently explain:

  • Why macros were used
  • Why SQL was better than DATA step
  • How fraud logic mirrors real audits
  • How date intelligence supports scheduling
  • How plots communicate business risks

22. Conclusion

This project proves that:

·       Simple data can reveal powerful truths

·       SAS is not just programming—it’s decision support

·       Cold storage analytics protects money, quality, and trust

By combining:

·       Data steps

·       SQL

·       Macros

·       Statistics

·       Visualization


INTERVIEW QUESTIONS FOR YOU

·  What is the difference between LENGTH and FORMAT in SAS?

·  What is the difference between MISSING and 0 in SAS?

·  What are temporary and permanent SAS datasets?

 

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

About the Author:

SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.


Disclaimer:

The datasets and analysis in this article are created for educational and demonstration purposes only. They do not represent COLD STORAGE data.


Our Mission:

This blog provides industry-focused SAS programming tutorials and analytics projects covering finance, healthcare, and technology.


This project is suitable for:

·  Students learning SAS

·  Data analysts building portfolios

·  Professionals preparing for SAS interviews

·  Bloggers writing about analytics and smart cities

·  EV and energy industry professionals

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Follow Us On : 


 
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--->Follow our blog for more SAS-based analytics projects and industry data models.

---> Support Us By Following Our Blog..

To deepen your understanding of SAS analytics, please refer to our other data science and industry-focused projects listed below:

1. Are We Losing Wildlife Faster Than We Think? – Animal Conservation Analysis Using SAS

2.Can Modern Art Data Explain What Actually Sells? – A Real-World SAS Analytics Project

3.Which Vehicles Truly Perform Better on Indian Roads? – A Real-World SAS Analytics Project

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

About Us | Contact Privacy Policy


Comments

Popular posts from this blog

379.CAN SAS DELIVER END-TO-END TLF (TABLES, LISTINGS, AND FIGURES) FOR INSURANCE CLAIMS ANALYSIS?

184.HOW TO CREATE MOCK SHELLS AND CLINICAL DATA LISTINGS IN SAS USING DATA NULL | PROC PRINT | PROC REPORT | PROC SORT | PROC COMPARE | ODS PDF | ODS RTF | ODS EXCEL | A COMPLETE STEP-BY-STEP GUIDE FOR CLINICAL SAS PROGRAMMERS

383.Which Pharma Distributors Are High-Risk? Can SAS Detect Delays, Temperature Violations, and Fraud?