165.MASTERING E-COMMERCE ORDER FULFILLMENT ANALYTICS USING SAS | FROM DATA CREATION WITH PROC DATA TO INSIGHTFUL REPORTING WITH PROC REPORT | LEVERAGING PROC CONTENTS | PROC PRINT | PROC SQL | PROC EXPORT | PROC MEANS | PROC FREQ | PROC SGPLOT | PROC REPORT | ENHANCED WITH MACROS FOR AUTOMATION

MASTERING E-COMMERCE ORDER FULFILLMENT ANALYTICS USING SAS | FROM DATA CREATION WITH PROC DATA TO INSIGHTFUL REPORTING WITH PROC REPORT | LEVERAGING PROC CONTENTS | PROC PRINT | PROC SQL | PROC EXPORT | PROC MEANS | PROC FREQ | PROC SGPLOT | PROC REPORT | ENHANCED WITH MACROS FOR AUTOMATION


 /*Creating a unique dataset centered around "E-commerce Order Fulfillment"*/

/*Dataset Overview: E-commerce Order Fulfillment*/

OrderID: Unique identifier for each order.

CustomerID: Identifier for the customer placing the order.

OrderDate: Date when the order was placed

ShipDate: Date when the order was shipped.

DeliveryDate: Date when the order was delivered.

ProductID: Identifier for the product ordered.

Quantity: Number of units ordered.

UnitPrice: Price per unit of the product.

ShippingCost: Cost incurred for shipping.

OrderStatus: Current status of the order (e.g., Delivered, Shipped, Pending).

Region: Geographic region of the delivery address.


/*Creating the Dataset*/

data ecommerce_orders;

    format OrderDate ShipDate DeliveryDate date9.;

    do OrderID = 1 to 15;

        CustomerID = ceil(ranuni(123)*500);

        OrderDate = '01JAN2025'd + ceil(ranuni(123)*90);

        ShipLag = ceil(ranuni(123)*5);

        DeliveryLag = ShipLag + ceil(ranuni(123)*5);

        ShipDate = OrderDate + ShipLag;

        DeliveryDate = ShipDate + DeliveryLag;

        ProductID = ceil(ranuni(123)*100);

        Quantity = ceil(ranuni(123)*10);

        UnitPrice = round(10 + ranuni(123)*90, 0.01);

        ShippingCost = round(5 + ranuni(123)*15, 0.01);

        StatusProb = ranuni(123);

        if StatusProb < 0.7 then OrderStatus = 'Delivered';

        else if StatusProb < 0.9 then OrderStatus = 'Shipped';

        else OrderStatus = 'Pending';

        RegionProb = ranuni(123);

        if RegionProb < 0.25 then Region = 'North';

        else if RegionProb < 0.5 then Region = 'South';

        else if RegionProb < 0.75 then Region = 'East';

        else Region = 'West';

        output;

    end;

    drop ShipLag DeliveryLag StatusProb RegionProb;

run;

proc print;run;

Output:

Obs OrderDate ShipDate DeliveryDate OrderID CustomerID ProductID Quantity UnitPrice ShippingCost OrderStatus Region
1 30JAN2025 31JAN2025 06FEB2025 1 376 36 3 80.78 10.97 Delivered North
2 10FEB2025 15FEB2025 22FEB2025 2 389 72 6 57.81 17.92 Delivered West
3 12MAR2025 16MAR2025 22MAR2025 3 327 53 10 14.18 14.22 Delivered South
4 30JAN2025 02FEB2025 07FEB2025 4 295 53 6 18.84 6.63 Delivered East
5 15JAN2025 19JAN2025 28JAN2025 5 242 76 10 18.47 15.16 Shipped South
6 14MAR2025 18MAR2025 24MAR2025 6 331 78 6 73.45 17.48 Delivered North
7 23JAN2025 27JAN2025 04FEB2025 7 127 26 9 52.60 15.30 Shipped East
8 27FEB2025 03MAR2025 12MAR2025 8 38 10 6 41.64 8.00 Delivered North
9 02JAN2025 04JAN2025 07JAN2025 9 500 85 7 11.20 10.14 Shipped North
10 29MAR2025 03APR2025 10APR2025 10 377 29 4 84.59 15.07 Delivered East
11 10MAR2025 15MAR2025 25MAR2025 11 437 65 9 94.00 8.15 Shipped South
12 17FEB2025 21FEB2025 28FEB2025 12 5 54 7 33.61 6.24 Delivered North
13 29MAR2025 02APR2025 11APR2025 13 259 64 5 81.69 13.53 Delivered North
14 28MAR2025 31MAR2025 06APR2025 14 203 67 2 45.11 11.89 Delivered East
15 28FEB2025 02MAR2025 08MAR2025 15 390 33 7 39.91 11.47 Delivered North


/*Exploring the Dataset*/

proc contents data=ecommerce_orders;

run;

Output:

                                                                        The CONTENTS Procedure

Data Set Name WORK.ECOMMERCE_ORDERS Observations 15
Member Type DATA Variables 11
Engine V9 Indexes 0
Created 14/09/2015 00:49:30 Observation Length 88
Last Modified 14/09/2015 00:49:30 Deleted Observations 0
Protection   Compressed NO
Data Set Type   Sorted NO
Label      
Data Representation WINDOWS_64    
Encoding wlatin1 Western (Windows)    

Engine/Host Dependent Information
Data Set Page Size 65536
Number of Data Set Pages 1
First Data Page 1
Max Obs per Page 743
Obs in First Data Page 15
Number of Data Set Repairs 0
ExtendObsCounter YES
Filename C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD8552_DESKTOP-QFAA4KV_\ecommerce_orders.sas7bdat
Release Created 9.0401M2
Host Created X64_8HOME

Alphabetic List of Variables and Attributes
# Variable Type Len Format
5 CustomerID Num 8  
3 DeliveryDate Num 8 DATE9.
1 OrderDate Num 8 DATE9.
4 OrderID Num 8  
10 OrderStatus Char 9  
6 ProductID Num 8  
7 Quantity Num 8  
11 Region Char 5  
2 ShipDate Num 8 DATE9.
9 ShippingCost Num 8  
8 UnitPrice Num 8  

proc print data=ecommerce_orders(obs=10);

run;

Output:

Obs OrderDate ShipDate DeliveryDate OrderID CustomerID ProductID Quantity UnitPrice ShippingCost OrderStatus Region
1 30JAN2025 31JAN2025 06FEB2025 1 376 36 3 80.78 10.97 Delivered North
2 10FEB2025 15FEB2025 22FEB2025 2 389 72 6 57.81 17.92 Delivered West
3 12MAR2025 16MAR2025 22MAR2025 3 327 53 10 14.18 14.22 Delivered South
4 30JAN2025 02FEB2025 07FEB2025 4 295 53 6 18.84 6.63 Delivered East
5 15JAN2025 19JAN2025 28JAN2025 5 242 76 10 18.47 15.16 Shipped South
6 14MAR2025 18MAR2025 24MAR2025 6 331 78 6 73.45 17.48 Delivered North
7 23JAN2025 27JAN2025 04FEB2025 7 127 26 9 52.60 15.30 Shipped East
8 27FEB2025 03MAR2025 12MAR2025 8 38 10 6 41.64 8.00 Delivered North
9 02JAN2025 04JAN2025 07JAN2025 9 500 85 7 11.20 10.14 Shipped North
10 29MAR2025 03APR2025 10APR2025 10 377 29 4 84.59 15.07 Delivered East

/*Total Sales per Region*/

proc sql;

    select Region,

           sum(Quantity * UnitPrice) as TotalSales format=dollar12.2

    from ecommerce_orders

    group by Region;

quit;

Output:

Region TotalSales
East $1,015.02
North $1,934.37
South $1,172.50
West $346.86

/*Average Delivery Time per Region*/

proc sql;

    select Region,

           avg(DeliveryDate - OrderDate) as AvgDeliveryTime format=8.2

    from ecommerce_orders

    where OrderStatus = 'Delivered'

    group by Region;

quit;

Output:

Region AvgDeliveryTime
East 9.67
North 10.33
South 10.00
West 12.00

/*Creating Macro Variables with PROC SQL*/

proc sql noprint;

    select count(*) into :TotalOrders

    from ecommerce_orders;


    select sum(Quantity * UnitPrice) into :TotalSales

    from ecommerce_orders;

quit;


%put Total Orders: &TotalOrders;

Log:

Total Orders:       15

%put Total Sales: &TotalSales;

Log:

Total Sales:  4468.75

/*Automating Reports with Macros*/

%macro RegionReport(region);

    title "E-commerce Report for &region Region";

    proc sql;

        select OrderStatus,

               count(*) as OrderCount,

               sum(Quantity * UnitPrice) as TotalSales format=dollar12.2

        from ecommerce_orders

        where Region = "&region"

        group by OrderStatus;

    quit;

%mend;


%RegionReport(North);

Output:

                                                    E-commerce Report for North Region

OrderStatus OrderCount TotalSales
Delivered 6 $1,855.97
Shipped 1 $78.40

%RegionReport(South);

Output:

                                                      E-commerce Report for South Region

OrderStatus OrderCount TotalSales
Delivered 1 $141.80
Shipped 2 $1,030.70

%RegionReport(East);

Output:

                                                     E-commerce Report for East Region

OrderStatus OrderCount TotalSales
Delivered 3 $541.62
Shipped 1 $473.40

%RegionReport(West);

Output:

                                                    E-commerce Report for West Region

OrderStatus OrderCount TotalSales
Delivered 1 $346.86
 

/*Orders Over Time*/

proc sql;

    create table orders_over_time as

    select OrderDate, count(*) as Orders

    from ecommerce_orders

    group by OrderDate;

quit;

proc print;run;

Output:

Obs OrderDate Orders
1 02JAN2025 1
2 15JAN2025 1
3 23JAN2025 1
4 30JAN2025 2
5 10FEB2025 1
6 17FEB2025 1
7 27FEB2025 1
8 28FEB2025 1
9 10MAR2025 1
10 12MAR2025 1
11 14MAR2025 1
12 28MAR2025 1
13 29MAR2025 2


proc sgplot data=orders_over_time;

    series x=OrderDate y=Orders;

    title "Daily Orders Over Time";

run;


/*Sales by Region*/

proc sql;

    create table sales_by_region as

    select Region, sum(Quantity * UnitPrice) as TotalSales

    from ecommerce_orders

    group by Region;

quit;

proc print;run;

Output:

Obs Region TotalSales
1 East 1015.02
2 North 1934.37
3 South 1172.50
4 West 346.86


proc sgplot data=sales_by_region;

    vbar Region / response=TotalSales stat=sum;

    title "Total Sales by Region";

run;


/*Identifying Top 5 Products by Sales*/

proc sql;

    create table top_products as

    select ProductID,

           sum(Quantity * UnitPrice) as ProductSales

    from ecommerce_orders

    group by ProductID

    order by ProductSales desc;

quit;

proc print data=top_products(obs=5);

    title "Top 5 Products by Sales";

run;

Output:

                                                                Top 5 Products by Sales

Obs ProductID ProductSales
1 65 846.00
2 26 473.40
3 78 440.70
4 64 408.45
5 72 346.86

/*Delivery Time Analysis*/

data delivery_analysis;

    set ecommerce_orders;

    DeliveryTime = DeliveryDate - OrderDate;

run;


proc means data=delivery_analysis mean std min max;

    var DeliveryTime;

    class Region;

    title "Delivery Time Statistics by Region";

run;

Output:

                                                     Delivery Time Statistics by Region

                                                           The MEANS Procedure

Analysis Variable : DeliveryTime
Region N Obs Mean Std Dev Minimum Maximum
East 4 10.2500000 2.0615528 8.0000000 12.0000000
North 7 9.5714286 3.0472470 5.0000000 13.0000000
South 3 12.6666667 2.5166115 10.0000000 15.0000000
West 1 12.0000000 . 12.0000000 12.0000000

/*Generating a Summary Report*/

proc report data=ecommerce_orders nowd;

    column Region OrderStatus Quantity UnitPrice TotalSales;

    define Region / group;

    define OrderStatus / group;

    define Quantity / analysis sum format=comma12. 'Total Quantity';

    define UnitPrice / analysis mean format=dollar12.2 'Average Unit Price';

    define TotalSales / computed format=dollar12.2 'Total Sales';


    compute TotalSales;

        TotalSales = Quantity.sum * UnitPrice.mean;

    endcomp;


    title "Summary Report by Region and Order Status";

run;rt by Region and Order Status";

run;

Output:

                                         Summary Report by Region and Order Status

Region OrderStatus Total Quantity Average Unit Price Total Sales
East Delivered 12 $49.51 $594.16
  Shipped 9 $52.60 $473.40
North Delivered 34 $58.51 $1,989.45
  Shipped 7 $11.20 $78.40
South Delivered 10 $14.18 $141.80
  Shipped 19 $56.24 $1,068.47
West Delivered 6 $57.81 $346.86

/*Checking for Missing Values*/

proc means data=ecommerce_orders n nmiss;

    var Quantity UnitPrice ShippingCost;

    title "Missing Values Check";

run;

Output:

                                                                 Missing Values Check

                                                                 The MEANS Procedure

Variable N N Miss
Quantity
UnitPrice
ShippingCost
15
15
15
0
0
0

/*Validating Date Sequences*/

data date_validation;

    set ecommerce_orders;

    if ShipDate < OrderDate or DeliveryDate < ShipDate then Flag = 'Invalid Dates';

    else Flag = 'Valid';

run;


proc freq data=date_validation;

    tables Flag;

    title "Date Validation Results";

run;

Output:

                                                             Date Validation Results

                                                               The FREQ Procedure

Flag Frequency Percent Cumulative
Frequency
Cumulative
Percent
Valid 15 100.00 15 100.00

/*Exporting the Cleaned Dataset*/

proc export data=ecommerce_orders

    outfile='ecommerce_orders_cleaned.csv'

    dbms=csv

    replace;

run;


PRACTICE AND COMMENT YOUR CODE: 

-->PLEASE FOLLOW OUR BLOG FOR MORE UPDATES.

TO FOLLOW OUR TELEGRAM CHANNEL CLICK HERE


Comments