168.COMPREHENSIVE ONLINE RETAIL ANALYSIS USING SAS | DATA GENERATION WITH DATA STEP | DATA EXPLORATION USING PROC CONTENTS | PROC PRINT | PROC MEANS | DATA TRANSFORMATION VIA DATA STEP | ADVANCED ANALYSIS WITH PROC SQL | AUTOMATION THROUGH MACROS | VISUALIZATION USING PROC SGPLOT | REPORTING WITH PROC REPORT | A STEP-BY-STEP GUIDE TO MASTERING SAS FOR ONLINE RETAIL DATA
- Get link
- X
- Other Apps
COMPREHENSIVE ONLINE RETAIL ANALYSIS USING SAS | DATA GENERATION WITH DATA STEP | DATA EXPLORATION USING PROC CONTENTS | PROC PRINT | PROC MEANS | DATA TRANSFORMATION VIA DATA STEP | ADVANCED ANALYSIS WITH PROC SQL | AUTOMATION THROUGH MACROS | VISUALIZATION USING PROC SGPLOT | REPORTING WITH PROC REPORT | A STEP-BY-STEP GUIDE TO MASTERING SAS FOR ONLINE RETAIL DATA
/*Creating a unique dataset related to Online Retail Transactions*/
/*Dataset Overview: Online Retail Transactions*/
We'll simulate an online retail dataset capturing customer purchases over a year. The dataset will include:
Transaction_ID: Unique identifier for each transaction.
Customer_ID: Unique identifier for each customer.
Transaction_Date: Date of the transaction.
Product_ID: Unique identifier for each product.
Product_Category: Category of the product (e.g., Electronics, Clothing).
Quantity: Number of units purchased.
Unit_Price: Price per unit of the product.
Total_Amount: Total amount for the transaction (Quantity × Unit_Price).
Payment_Method: Method of payment used (e.g., Credit Card, PayPal).
/*Data Creation*/
data online_retail;
call streaminit(123); /* For reproducibility */
format Transaction_Date date9.;
array categories[5] $20 _temporary_ ('Electronics', 'Clothing', 'Home & Kitchen', 'Books', 'Sports');
array payments[4] $20 _temporary_ ('Credit Card', 'PayPal', 'Debit Card', 'Net Banking');
do Transaction_ID = 1 to 20;
Customer_ID = ceil(ranuni(0) * 20); /* 20 unique customers */
Transaction_Date = '01JAN2024'd + ceil(ranuni(0) * 364);
Product_ID = ceil(ranuni(0) * 500); /* 500 unique products */
Product_Category = categories[ceil(ranuni(0) * dim(categories))];
Quantity = ceil(ranuni(0) * 5);
Unit_Price = round(10 + ranuni(0) * 90, 0.01); /* Prices between $10 and $100 */
Total_Amount = Quantity * Unit_Price;
Payment_Method = payments[ceil(ranuni(0) * dim(payments))];
output;
end;
run;
proc print;run;
Output:
Obs | Transaction_Date | Transaction_ID | Customer_ID | Product_ID | Product_Category | Quantity | Unit_Price | Total_Amount | Payment_Method |
---|---|---|---|---|---|---|---|---|---|
1 | 26DEC2024 | 1 | 14 | 462 | Books | 5 | 64.27 | 321.35 | Debit Card |
2 | 11AUG2024 | 2 | 16 | 273 | Home & Kitchen | 4 | 17.75 | 71.00 | Credit Card |
3 | 02DEC2024 | 3 | 11 | 339 | Home & Kitchen | 4 | 53.63 | 214.52 | PayPal |
4 | 25FEB2024 | 4 | 13 | 349 | Sports | 4 | 72.42 | 289.68 | Debit Card |
5 | 02APR2024 | 5 | 1 | 85 | Books | 4 | 65.45 | 261.80 | Net Banking |
6 | 18JAN2024 | 6 | 3 | 433 | Electronics | 2 | 90.40 | 180.80 | Debit Card |
7 | 03JUN2024 | 7 | 7 | 463 | Sports | 1 | 25.71 | 25.71 | PayPal |
8 | 25OCT2024 | 8 | 7 | 261 | Books | 1 | 50.36 | 50.36 | PayPal |
9 | 30JUL2024 | 9 | 15 | 46 | Sports | 5 | 93.59 | 467.95 | Credit Card |
10 | 25JUN2024 | 10 | 8 | 127 | Clothing | 5 | 53.01 | 265.05 | Debit Card |
11 | 19MAR2024 | 11 | 2 | 466 | Sports | 1 | 96.46 | 96.46 | Debit Card |
12 | 14JUL2024 | 12 | 4 | 479 | Clothing | 1 | 49.41 | 49.41 | Debit Card |
13 | 16FEB2024 | 13 | 18 | 127 | Electronics | 3 | 31.34 | 94.02 | PayPal |
14 | 09JUL2024 | 14 | 17 | 362 | Electronics | 1 | 13.35 | 13.35 | PayPal |
15 | 10JUL2024 | 15 | 14 | 285 | Clothing | 2 | 18.59 | 37.18 | Net Banking |
16 | 21NOV2024 | 16 | 13 | 218 | Clothing | 1 | 70.01 | 70.01 | PayPal |
17 | 07MAR2024 | 17 | 3 | 85 | Electronics | 4 | 86.80 | 347.20 | PayPal |
18 | 13DEC2024 | 18 | 17 | 72 | Sports | 5 | 15.32 | 76.60 | Net Banking |
19 | 18MAR2024 | 19 | 5 | 213 | Sports | 5 | 19.52 | 97.60 | Debit Card |
20 | 18DEC2024 | 20 | 11 | 128 | Sports | 4 | 14.64 | 58.56 | Net Banking |
/*View Dataset Structure*/
proc contents data=online_retail;
run;
Output:
Data Set Name | WORK.ONLINE_RETAIL | Observations | 20 |
---|---|---|---|
Member Type | DATA | Variables | 9 |
Engine | V9 | Indexes | 0 |
Created | 14/09/2015 00:13:14 | Observation Length | 96 |
Last Modified | 14/09/2015 00:13:14 | Deleted Observations | 0 |
Protection | Compressed | NO | |
Data Set Type | Sorted | NO | |
Label | |||
Data Representation | WINDOWS_64 | ||
Encoding | wlatin1 Western (Windows) |
Engine/Host Dependent Information | |
---|---|
Data Set Page Size | 65536 |
Number of Data Set Pages | 1 |
First Data Page | 1 |
Max Obs per Page | 681 |
Obs in First Data Page | 20 |
Number of Data Set Repairs | 0 |
ExtendObsCounter | YES |
Filename | C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD10624_DESKTOP-QFAA4KV_\online_retail.sas7bdat |
Release Created | 9.0401M2 |
Host Created | X64_8HOME |
Alphabetic List of Variables and Attributes | ||||
---|---|---|---|---|
# | Variable | Type | Len | Format |
3 | Customer_ID | Num | 8 | |
9 | Payment_Method | Char | 20 | |
5 | Product_Category | Char | 20 | |
4 | Product_ID | Num | 8 | |
6 | Quantity | Num | 8 | |
8 | Total_Amount | Num | 8 | |
1 | Transaction_Date | Num | 8 | DATE9. |
2 | Transaction_ID | Num | 8 | |
7 | Unit_Price | Num | 8 |
proc print data=online_retail(obs=10);
run;
Output:
Obs | Transaction_Date | Transaction_ID | Customer_ID | Product_ID | Product_Category | Quantity | Unit_Price | Total_Amount | Payment_Method |
---|---|---|---|---|---|---|---|---|---|
1 | 26DEC2024 | 1 | 14 | 462 | Books | 5 | 64.27 | 321.35 | Debit Card |
2 | 11AUG2024 | 2 | 16 | 273 | Home & Kitchen | 4 | 17.75 | 71.00 | Credit Card |
3 | 02DEC2024 | 3 | 11 | 339 | Home & Kitchen | 4 | 53.63 | 214.52 | PayPal |
4 | 25FEB2024 | 4 | 13 | 349 | Sports | 4 | 72.42 | 289.68 | Debit Card |
5 | 02APR2024 | 5 | 1 | 85 | Books | 4 | 65.45 | 261.80 | Net Banking |
6 | 18JAN2024 | 6 | 3 | 433 | Electronics | 2 | 90.40 | 180.80 | Debit Card |
7 | 03JUN2024 | 7 | 7 | 463 | Sports | 1 | 25.71 | 25.71 | PayPal |
8 | 25OCT2024 | 8 | 7 | 261 | Books | 1 | 50.36 | 50.36 | PayPal |
9 | 30JUL2024 | 9 | 15 | 46 | Sports | 5 | 93.59 | 467.95 | Credit Card |
10 | 25JUN2024 | 10 | 8 | 127 | Clothing | 5 | 53.01 | 265.05 | Debit Card |
/*Summary Statistics*/
proc means data=online_retail n mean std min max;
var Quantity Unit_Price Total_Amount;
run;
Output:
Variable | N | Mean | Std Dev | Minimum | Maximum | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
|
|
|
|
proc sql;
select Product_Category,
sum(Total_Amount) as Total_Sales format=dollar12.2
from online_retail
group by Product_Category
order by Total_Sales desc;
quit;
Output:
Product_Category | Total_Sales |
---|---|
Sports | $1,112.56 |
Electronics | $635.37 |
Books | $633.51 |
Clothing | $421.65 |
Home & Kitchen | $285.52 |
/*Top 5 Customers by Total Spend*/
proc sql outobs=5;
select Customer_ID,
sum(Total_Amount) as Total_Spend format=dollar12.2
from online_retail
group by Customer_ID
order by Total_Spend desc;
quit;
Output:
Customer_ID | Total_Spend |
---|---|
3 | $528.00 |
15 | $467.95 |
13 | $359.69 |
14 | $358.53 |
11 | $273.08 |
/*Monthly Sales Trend*/
proc sql;
select month(Transaction_Date) as Month,
sum(Total_Amount) as Monthly_Sales format=dollar12.2
from online_retail
group by calculated Month
order by calculated Month;
quit;
Output:
Month | Monthly_Sales |
---|---|
1 | $180.80 |
2 | $383.70 |
3 | $541.26 |
4 | $261.80 |
6 | $290.76 |
7 | $567.89 |
8 | $71.00 |
10 | $50.36 |
11 | $70.01 |
12 | $671.03 |
/*Creating a New Variable: Average Price per Unit*/
data online_retail;
set online_retail;
Avg_Price_Per_Unit = Total_Amount / Quantity;
run;
proc print;run;
Output:
Obs | Transaction_Date | Transaction_ID | Customer_ID | Product_ID | Product_Category | Quantity | Unit_Price | Total_Amount | Payment_Method | Avg_Price_Per_Unit |
---|---|---|---|---|---|---|---|---|---|---|
1 | 26DEC2024 | 1 | 14 | 462 | Books | 5 | 64.27 | 321.35 | Debit Card | 64.27 |
2 | 11AUG2024 | 2 | 16 | 273 | Home & Kitchen | 4 | 17.75 | 71.00 | Credit Card | 17.75 |
3 | 02DEC2024 | 3 | 11 | 339 | Home & Kitchen | 4 | 53.63 | 214.52 | PayPal | 53.63 |
4 | 25FEB2024 | 4 | 13 | 349 | Sports | 4 | 72.42 | 289.68 | Debit Card | 72.42 |
5 | 02APR2024 | 5 | 1 | 85 | Books | 4 | 65.45 | 261.80 | Net Banking | 65.45 |
6 | 18JAN2024 | 6 | 3 | 433 | Electronics | 2 | 90.40 | 180.80 | Debit Card | 90.40 |
7 | 03JUN2024 | 7 | 7 | 463 | Sports | 1 | 25.71 | 25.71 | PayPal | 25.71 |
8 | 25OCT2024 | 8 | 7 | 261 | Books | 1 | 50.36 | 50.36 | PayPal | 50.36 |
9 | 30JUL2024 | 9 | 15 | 46 | Sports | 5 | 93.59 | 467.95 | Credit Card | 93.59 |
10 | 25JUN2024 | 10 | 8 | 127 | Clothing | 5 | 53.01 | 265.05 | Debit Card | 53.01 |
11 | 19MAR2024 | 11 | 2 | 466 | Sports | 1 | 96.46 | 96.46 | Debit Card | 96.46 |
12 | 14JUL2024 | 12 | 4 | 479 | Clothing | 1 | 49.41 | 49.41 | Debit Card | 49.41 |
13 | 16FEB2024 | 13 | 18 | 127 | Electronics | 3 | 31.34 | 94.02 | PayPal | 31.34 |
14 | 09JUL2024 | 14 | 17 | 362 | Electronics | 1 | 13.35 | 13.35 | PayPal | 13.35 |
15 | 10JUL2024 | 15 | 14 | 285 | Clothing | 2 | 18.59 | 37.18 | Net Banking | 18.59 |
16 | 21NOV2024 | 16 | 13 | 218 | Clothing | 1 | 70.01 | 70.01 | PayPal | 70.01 |
17 | 07MAR2024 | 17 | 3 | 85 | Electronics | 4 | 86.80 | 347.20 | PayPal | 86.80 |
18 | 13DEC2024 | 18 | 17 | 72 | Sports | 5 | 15.32 | 76.60 | Net Banking | 15.32 |
19 | 18MAR2024 | 19 | 5 | 213 | Sports | 5 | 19.52 | 97.60 | Debit Card | 19.52 |
20 | 18DEC2024 | 20 | 11 | 128 | Sports | 4 | 14.64 | 58.56 | Net Banking | 14.64 |
/*Categorizing Transactions Based on Total Amount*/
data online_retail;
retain Transaction_Date Transaction_ID Customer_ID Product_ID Product_Category Quantity Unit_Price Total_Amount Payment_Method Avg_Price_Per_Unit ;
length Transaction_Size $10.;
set online_retail;
if Total_Amount < 50 then Transaction_Size = 'Small';
else if Total_Amount < 150 then Transaction_Size = 'Medium';
else Transaction_Size = 'Large';
run;
proc print;run;
Output:
Obs | Transaction_Date | Transaction_ID | Customer_ID | Product_ID | Product_Category | Quantity | Unit_Price | Total_Amount | Payment_Method | Avg_Price_Per_Unit | Transaction_Size |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 26DEC2024 | 1 | 14 | 462 | Books | 5 | 64.27 | 321.35 | Debit Card | 64.27 | Large |
2 | 11AUG2024 | 2 | 16 | 273 | Home & Kitchen | 4 | 17.75 | 71.00 | Credit Card | 17.75 | Medium |
3 | 02DEC2024 | 3 | 11 | 339 | Home & Kitchen | 4 | 53.63 | 214.52 | PayPal | 53.63 | Large |
4 | 25FEB2024 | 4 | 13 | 349 | Sports | 4 | 72.42 | 289.68 | Debit Card | 72.42 | Large |
5 | 02APR2024 | 5 | 1 | 85 | Books | 4 | 65.45 | 261.80 | Net Banking | 65.45 | Large |
6 | 18JAN2024 | 6 | 3 | 433 | Electronics | 2 | 90.40 | 180.80 | Debit Card | 90.40 | Large |
7 | 03JUN2024 | 7 | 7 | 463 | Sports | 1 | 25.71 | 25.71 | PayPal | 25.71 | Small |
8 | 25OCT2024 | 8 | 7 | 261 | Books | 1 | 50.36 | 50.36 | PayPal | 50.36 | Medium |
9 | 30JUL2024 | 9 | 15 | 46 | Sports | 5 | 93.59 | 467.95 | Credit Card | 93.59 | Large |
10 | 25JUN2024 | 10 | 8 | 127 | Clothing | 5 | 53.01 | 265.05 | Debit Card | 53.01 | Large |
11 | 19MAR2024 | 11 | 2 | 466 | Sports | 1 | 96.46 | 96.46 | Debit Card | 96.46 | Medium |
12 | 14JUL2024 | 12 | 4 | 479 | Clothing | 1 | 49.41 | 49.41 | Debit Card | 49.41 | Small |
13 | 16FEB2024 | 13 | 18 | 127 | Electronics | 3 | 31.34 | 94.02 | PayPal | 31.34 | Medium |
14 | 09JUL2024 | 14 | 17 | 362 | Electronics | 1 | 13.35 | 13.35 | PayPal | 13.35 | Small |
15 | 10JUL2024 | 15 | 14 | 285 | Clothing | 2 | 18.59 | 37.18 | Net Banking | 18.59 | Small |
16 | 21NOV2024 | 16 | 13 | 218 | Clothing | 1 | 70.01 | 70.01 | PayPal | 70.01 | Medium |
17 | 07MAR2024 | 17 | 3 | 85 | Electronics | 4 | 86.80 | 347.20 | PayPal | 86.80 | Large |
18 | 13DEC2024 | 18 | 17 | 72 | Sports | 5 | 15.32 | 76.60 | Net Banking | 15.32 | Medium |
19 | 18MAR2024 | 19 | 5 | 213 | Sports | 5 | 19.52 | 97.60 | Debit Card | 19.52 | Medium |
20 | 18DEC2024 | 20 | 11 | 128 | Sports | 4 | 14.64 | 58.56 | Net Banking | 14.64 | Medium |
/*Macro to Generate Sales Report for a Given Category*/
%macro category_sales_report(category);
proc sql;
title "Sales Report for &category";
select Product_ID,
sum(Quantity) as Total_Units_Sold,
sum(Total_Amount) as Total_Sales format=dollar12.2
from online_retail
where Product_Category = "&category"
group by Product_ID
order by Total_Sales desc;
quit;
title;
%mend;
%category_sales_report(Electronics);
Output:
Sales Report for Electronics |
Product_ID | Total_Units_Sold | Total_Sales |
---|---|---|
85 | 4 | $347.20 |
433 | 2 | $180.80 |
127 | 3 | $94.02 |
362 | 1 | $13.35 |
/*Macro to Create Separate Datasets for Each Payment Method*/
proc sql noprint;
select distinct Payment_Method into :payment1 - :payment4
from online_retail;
quit;
%macro split_by_payment;
%do i = 1 %to 4;
data payment_&&payment&i;
set online_retail;
where Payment_Method = "&&payment&i";
run;
proc print;run;
%end;
%mend;
%split_by_payment;
Output:
Obs | Transaction_Date | Transaction_ID | Customer_ID | Product_ID | Product_Category | Quantity | Unit_Price | Total_Amount | Payment_Method | Avg_Price_Per_Unit | Transaction_Size |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 11AUG2024 | 2 | 16 | 273 | Home & Kitchen | 4 | 17.75 | 71.00 | Credit Card | 17.75 | Medium |
2 | 30JUL2024 | 9 | 15 | 46 | Sports | 5 | 93.59 | 467.95 | Credit Card | 93.59 | Large |
Obs | Transaction_Date | Transaction_ID | Customer_ID | Product_ID | Product_Category | Quantity | Unit_Price | Total_Amount | Payment_Method | Avg_Price_Per_Unit | Transaction_Size |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 26DEC2024 | 1 | 14 | 462 | Books | 5 | 64.27 | 321.35 | Debit Card | 64.27 | Large |
2 | 25FEB2024 | 4 | 13 | 349 | Sports | 4 | 72.42 | 289.68 | Debit Card | 72.42 | Large |
3 | 18JAN2024 | 6 | 3 | 433 | Electronics | 2 | 90.40 | 180.80 | Debit Card | 90.40 | Large |
4 | 25JUN2024 | 10 | 8 | 127 | Clothing | 5 | 53.01 | 265.05 | Debit Card | 53.01 | Large |
5 | 19MAR2024 | 11 | 2 | 466 | Sports | 1 | 96.46 | 96.46 | Debit Card | 96.46 | Medium |
6 | 14JUL2024 | 12 | 4 | 479 | Clothing | 1 | 49.41 | 49.41 | Debit Card | 49.41 | Small |
7 | 18MAR2024 | 19 | 5 | 213 | Sports | 5 | 19.52 | 97.60 | Debit Card | 19.52 | Medium |
Obs | Transaction_Date | Transaction_ID | Customer_ID | Product_ID | Product_Category | Quantity | Unit_Price | Total_Amount | Payment_Method | Avg_Price_Per_Unit | Transaction_Size |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 02APR2024 | 5 | 1 | 85 | Books | 4 | 65.45 | 261.80 | Net Banking | 65.45 | Large |
2 | 10JUL2024 | 15 | 14 | 285 | Clothing | 2 | 18.59 | 37.18 | Net Banking | 18.59 | Small |
3 | 13DEC2024 | 18 | 17 | 72 | Sports | 5 | 15.32 | 76.60 | Net Banking | 15.32 | Medium |
4 | 18DEC2024 | 20 | 11 | 128 | Sports | 4 | 14.64 | 58.56 | Net Banking | 14.64 | Medium |
Obs | Transaction_Date | Transaction_ID | Customer_ID | Product_ID | Product_Category | Quantity | Unit_Price | Total_Amount | Payment_Method | Avg_Price_Per_Unit | Transaction_Size |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 02DEC2024 | 3 | 11 | 339 | Home & Kitchen | 4 | 53.63 | 214.52 | PayPal | 53.63 | Large |
2 | 03JUN2024 | 7 | 7 | 463 | Sports | 1 | 25.71 | 25.71 | PayPal | 25.71 | Small |
3 | 25OCT2024 | 8 | 7 | 261 | Books | 1 | 50.36 | 50.36 | PayPal | 50.36 | Medium |
4 | 16FEB2024 | 13 | 18 | 127 | Electronics | 3 | 31.34 | 94.02 | PayPal | 31.34 | Medium |
5 | 09JUL2024 | 14 | 17 | 362 | Electronics | 1 | 13.35 | 13.35 | PayPal | 13.35 | Small |
6 | 21NOV2024 | 16 | 13 | 218 | Clothing | 1 | 70.01 | 70.01 | PayPal | 70.01 | Medium |
7 | 07MAR2024 | 17 | 3 | 85 | Electronics | 4 | 86.80 | 347.20 | PayPal | 86.80 | Large |
/*Bar Chart of Sales by Product Category*/
proc sgplot data=online_retail;
vbar Product_Category / response=Total_Amount stat=sum datalabel;
yaxis label="Total Sales";
title "Total Sales by Product Category";
run;
Log:
NOTE: PROCEDURE SGPLOT used (Total process time):
real time 2.79 seconds
cpu time 0.53 seconds
NOTE: Listing image output written to SGPlot1.png.
NOTE: There were 20 observations read from the data set WORK.ONLINE_RETAIL.
/*Line Chart of Monthly Sales Trend*/
proc sql;
create table monthly_sales as
select month(Transaction_Date) as Month,
sum(Total_Amount) as Monthly_Sales
from online_retail
group by calculated Month
order by calculated Month;
quit;
proc sgplot data=monthly_sales;
series x=Month y=Monthly_Sales / markers;
xaxis label="Month";
yaxis label="Sales";
title "Monthly Sales Trend";
run;
Log:
NOTE: PROCEDURE SGPLOT used (Total process time):
real time 0.53 seconds
cpu time 0.07 seconds
NOTE: Listing image output written to SGPlot3.png.
NOTE: There were 10 observations read from the data set WORK.MONTHLY_SALES.
/*Generating Reports*/
ods pdf file="Online_Retail_Report.pdf";
proc report data=online_retail nowd;
column Product_Category Payment_Method Total_Amount;
define Product_Category / group;
define Payment_Method / group;
define Total_Amount / analysis sum format=dollar12.2;
title "Sales Report by Category and Payment Method";
run;
ods pdf close;
Sales Report by Category and Payment Method |
Product_Category | Payment_Method | Total_Amount |
---|---|---|
Books | Debit Card | $321.35 |
Net Banking | $261.80 | |
PayPal | $50.36 | |
Clothing | Debit Card | $314.46 |
Net Banking | $37.18 | |
PayPal | $70.01 | |
Electronics | Debit Card | $180.80 |
PayPal | $454.57 | |
Home & Kitchen | Credit Card | $71.00 |
PayPal | $214.52 | |
Sports | Credit Card | $467.95 |
Debit Card | $483.74 | |
Net Banking | $135.16 | |
PayPal | $25.71 |
- Get link
- X
- Other Apps
Comments
Post a Comment