165.MASTERING E-COMMERCE ORDER FULFILLMENT ANALYTICS USING SAS | FROM DATA CREATION WITH PROC DATA TO INSIGHTFUL REPORTING WITH PROC REPORT | LEVERAGING PROC CONTENTS | PROC PRINT | PROC SQL | PROC EXPORT | PROC MEANS | PROC FREQ | PROC SGPLOT | PROC REPORT | ENHANCED WITH MACROS FOR AUTOMATION
MASTERING E-COMMERCE ORDER FULFILLMENT ANALYTICS USING SAS | FROM DATA CREATION WITH PROC DATA TO INSIGHTFUL REPORTING WITH PROC REPORT | LEVERAGING PROC CONTENTS | PROC PRINT | PROC SQL | PROC EXPORT | PROC MEANS | PROC FREQ | PROC SGPLOT | PROC REPORT | ENHANCED WITH MACROS FOR AUTOMATION
/*Creating a unique dataset centered around "E-commerce Order Fulfillment"*/
/*Dataset Overview: E-commerce Order Fulfillment*/
OrderID: Unique identifier for each order.
CustomerID: Identifier for the customer placing the order.
OrderDate: Date when the order was placed
ShipDate: Date when the order was shipped.
DeliveryDate: Date when the order was delivered.
ProductID: Identifier for the product ordered.
Quantity: Number of units ordered.
UnitPrice: Price per unit of the product.
ShippingCost: Cost incurred for shipping.
OrderStatus: Current status of the order (e.g., Delivered, Shipped, Pending).
Region: Geographic region of the delivery address.
/*Creating the Dataset*/
data ecommerce_orders;
format OrderDate ShipDate DeliveryDate date9.;
do OrderID = 1 to 15;
CustomerID = ceil(ranuni(123)*500);
OrderDate = '01JAN2025'd + ceil(ranuni(123)*90);
ShipLag = ceil(ranuni(123)*5);
DeliveryLag = ShipLag + ceil(ranuni(123)*5);
ShipDate = OrderDate + ShipLag;
DeliveryDate = ShipDate + DeliveryLag;
ProductID = ceil(ranuni(123)*100);
Quantity = ceil(ranuni(123)*10);
UnitPrice = round(10 + ranuni(123)*90, 0.01);
ShippingCost = round(5 + ranuni(123)*15, 0.01);
StatusProb = ranuni(123);
if StatusProb < 0.7 then OrderStatus = 'Delivered';
else if StatusProb < 0.9 then OrderStatus = 'Shipped';
else OrderStatus = 'Pending';
RegionProb = ranuni(123);
if RegionProb < 0.25 then Region = 'North';
else if RegionProb < 0.5 then Region = 'South';
else if RegionProb < 0.75 then Region = 'East';
else Region = 'West';
output;
end;
drop ShipLag DeliveryLag StatusProb RegionProb;
run;
proc print;run;
Output:
Obs | OrderDate | ShipDate | DeliveryDate | OrderID | CustomerID | ProductID | Quantity | UnitPrice | ShippingCost | OrderStatus | Region |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 30JAN2025 | 31JAN2025 | 06FEB2025 | 1 | 376 | 36 | 3 | 80.78 | 10.97 | Delivered | North |
2 | 10FEB2025 | 15FEB2025 | 22FEB2025 | 2 | 389 | 72 | 6 | 57.81 | 17.92 | Delivered | West |
3 | 12MAR2025 | 16MAR2025 | 22MAR2025 | 3 | 327 | 53 | 10 | 14.18 | 14.22 | Delivered | South |
4 | 30JAN2025 | 02FEB2025 | 07FEB2025 | 4 | 295 | 53 | 6 | 18.84 | 6.63 | Delivered | East |
5 | 15JAN2025 | 19JAN2025 | 28JAN2025 | 5 | 242 | 76 | 10 | 18.47 | 15.16 | Shipped | South |
6 | 14MAR2025 | 18MAR2025 | 24MAR2025 | 6 | 331 | 78 | 6 | 73.45 | 17.48 | Delivered | North |
7 | 23JAN2025 | 27JAN2025 | 04FEB2025 | 7 | 127 | 26 | 9 | 52.60 | 15.30 | Shipped | East |
8 | 27FEB2025 | 03MAR2025 | 12MAR2025 | 8 | 38 | 10 | 6 | 41.64 | 8.00 | Delivered | North |
9 | 02JAN2025 | 04JAN2025 | 07JAN2025 | 9 | 500 | 85 | 7 | 11.20 | 10.14 | Shipped | North |
10 | 29MAR2025 | 03APR2025 | 10APR2025 | 10 | 377 | 29 | 4 | 84.59 | 15.07 | Delivered | East |
11 | 10MAR2025 | 15MAR2025 | 25MAR2025 | 11 | 437 | 65 | 9 | 94.00 | 8.15 | Shipped | South |
12 | 17FEB2025 | 21FEB2025 | 28FEB2025 | 12 | 5 | 54 | 7 | 33.61 | 6.24 | Delivered | North |
13 | 29MAR2025 | 02APR2025 | 11APR2025 | 13 | 259 | 64 | 5 | 81.69 | 13.53 | Delivered | North |
14 | 28MAR2025 | 31MAR2025 | 06APR2025 | 14 | 203 | 67 | 2 | 45.11 | 11.89 | Delivered | East |
15 | 28FEB2025 | 02MAR2025 | 08MAR2025 | 15 | 390 | 33 | 7 | 39.91 | 11.47 | Delivered | North |
/*Exploring the Dataset*/
proc contents data=ecommerce_orders;
run;
Output:
Data Set Name | WORK.ECOMMERCE_ORDERS | Observations | 15 |
---|---|---|---|
Member Type | DATA | Variables | 11 |
Engine | V9 | Indexes | 0 |
Created | 14/09/2015 00:49:30 | Observation Length | 88 |
Last Modified | 14/09/2015 00:49:30 | Deleted Observations | 0 |
Protection | Compressed | NO | |
Data Set Type | Sorted | NO | |
Label | |||
Data Representation | WINDOWS_64 | ||
Encoding | wlatin1 Western (Windows) |
Engine/Host Dependent Information | |
---|---|
Data Set Page Size | 65536 |
Number of Data Set Pages | 1 |
First Data Page | 1 |
Max Obs per Page | 743 |
Obs in First Data Page | 15 |
Number of Data Set Repairs | 0 |
ExtendObsCounter | YES |
Filename | C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD8552_DESKTOP-QFAA4KV_\ecommerce_orders.sas7bdat |
Release Created | 9.0401M2 |
Host Created | X64_8HOME |
Alphabetic List of Variables and Attributes | ||||
---|---|---|---|---|
# | Variable | Type | Len | Format |
5 | CustomerID | Num | 8 | |
3 | DeliveryDate | Num | 8 | DATE9. |
1 | OrderDate | Num | 8 | DATE9. |
4 | OrderID | Num | 8 | |
10 | OrderStatus | Char | 9 | |
6 | ProductID | Num | 8 | |
7 | Quantity | Num | 8 | |
11 | Region | Char | 5 | |
2 | ShipDate | Num | 8 | DATE9. |
9 | ShippingCost | Num | 8 | |
8 | UnitPrice | Num | 8 |
proc print data=ecommerce_orders(obs=10);
run;
Output:
Obs | OrderDate | ShipDate | DeliveryDate | OrderID | CustomerID | ProductID | Quantity | UnitPrice | ShippingCost | OrderStatus | Region |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 30JAN2025 | 31JAN2025 | 06FEB2025 | 1 | 376 | 36 | 3 | 80.78 | 10.97 | Delivered | North |
2 | 10FEB2025 | 15FEB2025 | 22FEB2025 | 2 | 389 | 72 | 6 | 57.81 | 17.92 | Delivered | West |
3 | 12MAR2025 | 16MAR2025 | 22MAR2025 | 3 | 327 | 53 | 10 | 14.18 | 14.22 | Delivered | South |
4 | 30JAN2025 | 02FEB2025 | 07FEB2025 | 4 | 295 | 53 | 6 | 18.84 | 6.63 | Delivered | East |
5 | 15JAN2025 | 19JAN2025 | 28JAN2025 | 5 | 242 | 76 | 10 | 18.47 | 15.16 | Shipped | South |
6 | 14MAR2025 | 18MAR2025 | 24MAR2025 | 6 | 331 | 78 | 6 | 73.45 | 17.48 | Delivered | North |
7 | 23JAN2025 | 27JAN2025 | 04FEB2025 | 7 | 127 | 26 | 9 | 52.60 | 15.30 | Shipped | East |
8 | 27FEB2025 | 03MAR2025 | 12MAR2025 | 8 | 38 | 10 | 6 | 41.64 | 8.00 | Delivered | North |
9 | 02JAN2025 | 04JAN2025 | 07JAN2025 | 9 | 500 | 85 | 7 | 11.20 | 10.14 | Shipped | North |
10 | 29MAR2025 | 03APR2025 | 10APR2025 | 10 | 377 | 29 | 4 | 84.59 | 15.07 | Delivered | East |
/*Total Sales per Region*/
proc sql;
select Region,
sum(Quantity * UnitPrice) as TotalSales format=dollar12.2
from ecommerce_orders
group by Region;
quit;
Output:
Region | TotalSales |
---|---|
East | $1,015.02 |
North | $1,934.37 |
South | $1,172.50 |
West | $346.86 |
/*Average Delivery Time per Region*/
proc sql;
select Region,
avg(DeliveryDate - OrderDate) as AvgDeliveryTime format=8.2
from ecommerce_orders
where OrderStatus = 'Delivered'
group by Region;
quit;
Output:
Region | AvgDeliveryTime |
---|---|
East | 9.67 |
North | 10.33 |
South | 10.00 |
West | 12.00 |
/*Creating Macro Variables with PROC SQL*/
proc sql noprint;
select count(*) into :TotalOrders
from ecommerce_orders;
select sum(Quantity * UnitPrice) into :TotalSales
from ecommerce_orders;
quit;
%put Total Orders: &TotalOrders;
Log:
%put Total Sales: &TotalSales;
Log:
/*Automating Reports with Macros*/
%macro RegionReport(region);
title "E-commerce Report for ®ion Region";
proc sql;
select OrderStatus,
count(*) as OrderCount,
sum(Quantity * UnitPrice) as TotalSales format=dollar12.2
from ecommerce_orders
where Region = "®ion"
group by OrderStatus;
quit;
%mend;
%RegionReport(North);
Output:
E-commerce Report for North Region |
OrderStatus | OrderCount | TotalSales |
---|---|---|
Delivered | 6 | $1,855.97 |
Shipped | 1 | $78.40 |
%RegionReport(South);
Output:
E-commerce Report for South Region |
OrderStatus | OrderCount | TotalSales |
---|---|---|
Delivered | 1 | $141.80 |
Shipped | 2 | $1,030.70 |
%RegionReport(East);
Output:
E-commerce Report for East Region |
OrderStatus | OrderCount | TotalSales |
---|---|---|
Delivered | 3 | $541.62 |
Shipped | 1 | $473.40 |
%RegionReport(West);
Output:
E-commerce Report for West Region |
OrderStatus | OrderCount | TotalSales |
---|---|---|
Delivered | 1 | $346.86 |
/*Orders Over Time*/
proc sql;
create table orders_over_time as
select OrderDate, count(*) as Orders
from ecommerce_orders
group by OrderDate;
quit;
proc print;run;
Output:
Obs | OrderDate | Orders |
---|---|---|
1 | 02JAN2025 | 1 |
2 | 15JAN2025 | 1 |
3 | 23JAN2025 | 1 |
4 | 30JAN2025 | 2 |
5 | 10FEB2025 | 1 |
6 | 17FEB2025 | 1 |
7 | 27FEB2025 | 1 |
8 | 28FEB2025 | 1 |
9 | 10MAR2025 | 1 |
10 | 12MAR2025 | 1 |
11 | 14MAR2025 | 1 |
12 | 28MAR2025 | 1 |
13 | 29MAR2025 | 2 |
proc sgplot data=orders_over_time;
series x=OrderDate y=Orders;
title "Daily Orders Over Time";
run;
/*Sales by Region*/
proc sql;
create table sales_by_region as
select Region, sum(Quantity * UnitPrice) as TotalSales
from ecommerce_orders
group by Region;
quit;
proc print;run;
Output:
Obs | Region | TotalSales |
---|---|---|
1 | East | 1015.02 |
2 | North | 1934.37 |
3 | South | 1172.50 |
4 | West | 346.86 |
proc sgplot data=sales_by_region;
vbar Region / response=TotalSales stat=sum;
title "Total Sales by Region";
run;
/*Identifying Top 5 Products by Sales*/
proc sql;
create table top_products as
select ProductID,
sum(Quantity * UnitPrice) as ProductSales
from ecommerce_orders
group by ProductID
order by ProductSales desc;
quit;
proc print data=top_products(obs=5);
title "Top 5 Products by Sales";
run;
Output:
Top 5 Products by Sales |
Obs | ProductID | ProductSales |
---|---|---|
1 | 65 | 846.00 |
2 | 26 | 473.40 |
3 | 78 | 440.70 |
4 | 64 | 408.45 |
5 | 72 | 346.86 |
/*Delivery Time Analysis*/
data delivery_analysis;
set ecommerce_orders;
DeliveryTime = DeliveryDate - OrderDate;
run;
proc means data=delivery_analysis mean std min max;
var DeliveryTime;
class Region;
title "Delivery Time Statistics by Region";
run;
Output:
Delivery Time Statistics by Region |
Analysis Variable : DeliveryTime | |||||
---|---|---|---|---|---|
Region | N Obs | Mean | Std Dev | Minimum | Maximum |
East | 4 | 10.2500000 | 2.0615528 | 8.0000000 | 12.0000000 |
North | 7 | 9.5714286 | 3.0472470 | 5.0000000 | 13.0000000 |
South | 3 | 12.6666667 | 2.5166115 | 10.0000000 | 15.0000000 |
West | 1 | 12.0000000 | . | 12.0000000 | 12.0000000 |
/*Generating a Summary Report*/
proc report data=ecommerce_orders nowd;
column Region OrderStatus Quantity UnitPrice TotalSales;
define Region / group;
define OrderStatus / group;
define Quantity / analysis sum format=comma12. 'Total Quantity';
define UnitPrice / analysis mean format=dollar12.2 'Average Unit Price';
define TotalSales / computed format=dollar12.2 'Total Sales';
compute TotalSales;
TotalSales = Quantity.sum * UnitPrice.mean;
endcomp;
title "Summary Report by Region and Order Status";
run;rt by Region and Order Status";
run;
Output:
Summary Report by Region and Order Status |
Region | OrderStatus | Total Quantity | Average Unit Price | Total Sales |
---|---|---|---|---|
East | Delivered | 12 | $49.51 | $594.16 |
Shipped | 9 | $52.60 | $473.40 | |
North | Delivered | 34 | $58.51 | $1,989.45 |
Shipped | 7 | $11.20 | $78.40 | |
South | Delivered | 10 | $14.18 | $141.80 |
Shipped | 19 | $56.24 | $1,068.47 | |
West | Delivered | 6 | $57.81 | $346.86 |
/*Checking for Missing Values*/
proc means data=ecommerce_orders n nmiss;
var Quantity UnitPrice ShippingCost;
title "Missing Values Check";
run;
Output:
Missing Values Check |
Variable | N | N Miss | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
|
|
|
/*Validating Date Sequences*/
data date_validation;
set ecommerce_orders;
if ShipDate < OrderDate or DeliveryDate < ShipDate then Flag = 'Invalid Dates';
else Flag = 'Valid';
run;
proc freq data=date_validation;
tables Flag;
title "Date Validation Results";
run;
Output:
Date Validation Results |
Flag | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
---|---|---|---|---|
Valid | 15 | 100.00 | 15 | 100.00 |
/*Exporting the Cleaned Dataset*/
proc export data=ecommerce_orders
outfile='ecommerce_orders_cleaned.csv'
dbms=csv
replace;
run;
Comments
Post a Comment