168.COMPREHENSIVE ONLINE RETAIL ANALYSIS USING SAS | DATA GENERATION WITH DATA STEP | DATA EXPLORATION USING PROC CONTENTS | PROC PRINT | PROC MEANS | DATA TRANSFORMATION VIA DATA STEP | ADVANCED ANALYSIS WITH PROC SQL | AUTOMATION THROUGH MACROS | VISUALIZATION USING PROC SGPLOT | REPORTING WITH PROC REPORT | A STEP-BY-STEP GUIDE TO MASTERING SAS FOR ONLINE RETAIL DATA

COMPREHENSIVE ONLINE RETAIL ANALYSIS USING SAS | DATA GENERATION WITH DATA STEP | DATA EXPLORATION USING PROC CONTENTS | PROC PRINT | PROC MEANS | DATA TRANSFORMATION VIA DATA STEP | ADVANCED ANALYSIS WITH PROC SQL | AUTOMATION THROUGH MACROS | VISUALIZATION USING PROC SGPLOT | REPORTING WITH PROC REPORT | A STEP-BY-STEP GUIDE TO MASTERING SAS FOR ONLINE RETAIL DATA

/*Creating a unique dataset related to Online Retail Transactions*/

/*Dataset Overview: Online Retail Transactions*/

We'll simulate an online retail dataset capturing customer purchases over a year. The dataset will include:

Transaction_ID: Unique identifier for each transaction.

Customer_ID: Unique identifier for each customer.

Transaction_Date: Date of the transaction.

Product_ID: Unique identifier for each product.

Product_Category: Category of the product (e.g., Electronics, Clothing).

Quantity: Number of units purchased.

Unit_Price: Price per unit of the product.

Total_Amount: Total amount for the transaction (Quantity × Unit_Price).

Payment_Method: Method of payment used (e.g., Credit Card, PayPal).


/*Data Creation*/

data online_retail;

    call streaminit(123); /* For reproducibility */

    format Transaction_Date date9.;

    array categories[5] $20 _temporary_ ('Electronics', 'Clothing', 'Home & Kitchen', 'Books', 'Sports');

    array payments[4] $20 _temporary_ ('Credit Card', 'PayPal', 'Debit Card', 'Net Banking');


    do Transaction_ID = 1 to 20;

        Customer_ID = ceil(ranuni(0) * 20); /* 20 unique customers */

        Transaction_Date = '01JAN2024'd + ceil(ranuni(0) * 364);

        Product_ID = ceil(ranuni(0) * 500); /* 500 unique products */

        Product_Category = categories[ceil(ranuni(0) * dim(categories))];

        Quantity = ceil(ranuni(0) * 5);

        Unit_Price = round(10 + ranuni(0) * 90, 0.01); /* Prices between $10 and $100 */

        Total_Amount = Quantity * Unit_Price;

        Payment_Method = payments[ceil(ranuni(0) * dim(payments))];

        output;

    end;

run;

proc print;run;

Output:

Obs Transaction_Date Transaction_ID Customer_ID Product_ID Product_Category Quantity Unit_Price Total_Amount Payment_Method
1 26DEC2024 1 14 462 Books 5 64.27 321.35 Debit Card
2 11AUG2024 2 16 273 Home & Kitchen 4 17.75 71.00 Credit Card
3 02DEC2024 3 11 339 Home & Kitchen 4 53.63 214.52 PayPal
4 25FEB2024 4 13 349 Sports 4 72.42 289.68 Debit Card
5 02APR2024 5 1 85 Books 4 65.45 261.80 Net Banking
6 18JAN2024 6 3 433 Electronics 2 90.40 180.80 Debit Card
7 03JUN2024 7 7 463 Sports 1 25.71 25.71 PayPal
8 25OCT2024 8 7 261 Books 1 50.36 50.36 PayPal
9 30JUL2024 9 15 46 Sports 5 93.59 467.95 Credit Card
10 25JUN2024 10 8 127 Clothing 5 53.01 265.05 Debit Card
11 19MAR2024 11 2 466 Sports 1 96.46 96.46 Debit Card
12 14JUL2024 12 4 479 Clothing 1 49.41 49.41 Debit Card
13 16FEB2024 13 18 127 Electronics 3 31.34 94.02 PayPal
14 09JUL2024 14 17 362 Electronics 1 13.35 13.35 PayPal
15 10JUL2024 15 14 285 Clothing 2 18.59 37.18 Net Banking
16 21NOV2024 16 13 218 Clothing 1 70.01 70.01 PayPal
17 07MAR2024 17 3 85 Electronics 4 86.80 347.20 PayPal
18 13DEC2024 18 17 72 Sports 5 15.32 76.60 Net Banking
19 18MAR2024 19 5 213 Sports 5 19.52 97.60 Debit Card
20 18DEC2024 20 11 128 Sports 4 14.64 58.56 Net Banking

/*View Dataset Structure*/

proc contents data=online_retail;

run;

Output:

                                                                       The CONTENTS Procedure

Data Set Name WORK.ONLINE_RETAIL Observations 20
Member Type DATA Variables 9
Engine V9 Indexes 0
Created 14/09/2015 00:13:14 Observation Length 96
Last Modified 14/09/2015 00:13:14 Deleted Observations 0
Protection   Compressed NO
Data Set Type   Sorted NO
Label      
Data Representation WINDOWS_64    
Encoding wlatin1 Western (Windows)    


Engine/Host Dependent Information
Data Set Page Size 65536
Number of Data Set Pages 1
First Data Page 1
Max Obs per Page 681
Obs in First Data Page 20
Number of Data Set Repairs 0
ExtendObsCounter YES
Filename C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD10624_DESKTOP-QFAA4KV_\online_retail.sas7bdat
Release Created 9.0401M2
Host Created X64_8HOME


Alphabetic List of Variables and Attributes
# Variable Type Len Format
3 Customer_ID Num 8  
9 Payment_Method Char 20  
5 Product_Category Char 20  
4 Product_ID Num 8  
6 Quantity Num 8  
8 Total_Amount Num 8  
1 Transaction_Date Num 8 DATE9.
2 Transaction_ID Num 8  
7 Unit_Price Num 8  


proc print data=online_retail(obs=10);

run;

Output:

Obs Transaction_Date Transaction_ID Customer_ID Product_ID Product_Category Quantity Unit_Price Total_Amount Payment_Method
1 26DEC2024 1 14 462 Books 5 64.27 321.35 Debit Card
2 11AUG2024 2 16 273 Home & Kitchen 4 17.75 71.00 Credit Card
3 02DEC2024 3 11 339 Home & Kitchen 4 53.63 214.52 PayPal
4 25FEB2024 4 13 349 Sports 4 72.42 289.68 Debit Card
5 02APR2024 5 1 85 Books 4 65.45 261.80 Net Banking
6 18JAN2024 6 3 433 Electronics 2 90.40 180.80 Debit Card
7 03JUN2024 7 7 463 Sports 1 25.71 25.71 PayPal
8 25OCT2024 8 7 261 Books 1 50.36 50.36 PayPal
9 30JUL2024 9 15 46 Sports 5 93.59 467.95 Credit Card
10 25JUN2024 10 8 127 Clothing 5 53.01 265.05 Debit Card

/*Summary Statistics*/

proc means data=online_retail n mean std min max;

    var Quantity Unit_Price Total_Amount;

run;

Output:

                                                                           The MEANS Procedure

Variable N Mean Std Dev Minimum Maximum
Quantity
Unit_Price
Total_Amount
20
20
20
3.1000000
50.1015000
154.4305000
1.6511559
29.1123554
130.0478325
1.0000000
13.3500000
13.3500000
5.0000000
96.4600000
467.9500000

/*Total Sales per Product Category*/

proc sql;

    select Product_Category, 

           sum(Total_Amount) as Total_Sales format=dollar12.2

    from online_retail

    group by Product_Category

    order by Total_Sales desc;

quit;

Output:

Product_Category Total_Sales
Sports $1,112.56
Electronics $635.37
Books $633.51
Clothing $421.65
Home & Kitchen $285.52

/*Top 5 Customers by Total Spend*/

proc sql outobs=5;

    select Customer_ID, 

           sum(Total_Amount) as Total_Spend format=dollar12.2

    from online_retail

    group by Customer_ID

    order by Total_Spend desc;

quit;

Output:

Customer_ID Total_Spend
3 $528.00
15 $467.95
13 $359.69
14 $358.53
11 $273.08

/*Monthly Sales Trend*/

proc sql;

    select month(Transaction_Date) as Month, 

           sum(Total_Amount) as Monthly_Sales format=dollar12.2

    from online_retail

    group by calculated Month

    order by calculated Month;

quit;

Output:

Month Monthly_Sales
1 $180.80
2 $383.70
3 $541.26
4 $261.80
6 $290.76
7 $567.89
8 $71.00
10 $50.36
11 $70.01
12 $671.03

/*Creating a New Variable: Average Price per Unit*/

data online_retail;

    set online_retail;

    Avg_Price_Per_Unit = Total_Amount / Quantity;

run;

proc print;run;

Output:

Obs Transaction_Date Transaction_ID Customer_ID Product_ID Product_Category Quantity Unit_Price Total_Amount Payment_Method Avg_Price_Per_Unit
1 26DEC2024 1 14 462 Books 5 64.27 321.35 Debit Card 64.27
2 11AUG2024 2 16 273 Home & Kitchen 4 17.75 71.00 Credit Card 17.75
3 02DEC2024 3 11 339 Home & Kitchen 4 53.63 214.52 PayPal 53.63
4 25FEB2024 4 13 349 Sports 4 72.42 289.68 Debit Card 72.42
5 02APR2024 5 1 85 Books 4 65.45 261.80 Net Banking 65.45
6 18JAN2024 6 3 433 Electronics 2 90.40 180.80 Debit Card 90.40
7 03JUN2024 7 7 463 Sports 1 25.71 25.71 PayPal 25.71
8 25OCT2024 8 7 261 Books 1 50.36 50.36 PayPal 50.36
9 30JUL2024 9 15 46 Sports 5 93.59 467.95 Credit Card 93.59
10 25JUN2024 10 8 127 Clothing 5 53.01 265.05 Debit Card 53.01
11 19MAR2024 11 2 466 Sports 1 96.46 96.46 Debit Card 96.46
12 14JUL2024 12 4 479 Clothing 1 49.41 49.41 Debit Card 49.41
13 16FEB2024 13 18 127 Electronics 3 31.34 94.02 PayPal 31.34
14 09JUL2024 14 17 362 Electronics 1 13.35 13.35 PayPal 13.35
15 10JUL2024 15 14 285 Clothing 2 18.59 37.18 Net Banking 18.59
16 21NOV2024 16 13 218 Clothing 1 70.01 70.01 PayPal 70.01
17 07MAR2024 17 3 85 Electronics 4 86.80 347.20 PayPal 86.80
18 13DEC2024 18 17 72 Sports 5 15.32 76.60 Net Banking 15.32
19 18MAR2024 19 5 213 Sports 5 19.52 97.60 Debit Card 19.52
20 18DEC2024 20 11 128 Sports 4 14.64 58.56 Net Banking 14.64

/*Categorizing Transactions Based on Total Amount*/

data online_retail;

    retain Transaction_Date Transaction_ID Customer_ID Product_ID Product_Category Quantity Unit_Price Total_Amount Payment_Method Avg_Price_Per_Unit ;

   length Transaction_Size $10.;

    set online_retail;

    if Total_Amount < 50 then Transaction_Size = 'Small';

    else if Total_Amount < 150 then Transaction_Size = 'Medium';

    else Transaction_Size = 'Large';

run;

proc print;run;

Output:

Obs Transaction_Date Transaction_ID Customer_ID Product_ID Product_Category Quantity Unit_Price Total_Amount Payment_Method Avg_Price_Per_Unit Transaction_Size
1 26DEC2024 1 14 462 Books 5 64.27 321.35 Debit Card 64.27 Large
2 11AUG2024 2 16 273 Home & Kitchen 4 17.75 71.00 Credit Card 17.75 Medium
3 02DEC2024 3 11 339 Home & Kitchen 4 53.63 214.52 PayPal 53.63 Large
4 25FEB2024 4 13 349 Sports 4 72.42 289.68 Debit Card 72.42 Large
5 02APR2024 5 1 85 Books 4 65.45 261.80 Net Banking 65.45 Large
6 18JAN2024 6 3 433 Electronics 2 90.40 180.80 Debit Card 90.40 Large
7 03JUN2024 7 7 463 Sports 1 25.71 25.71 PayPal 25.71 Small
8 25OCT2024 8 7 261 Books 1 50.36 50.36 PayPal 50.36 Medium
9 30JUL2024 9 15 46 Sports 5 93.59 467.95 Credit Card 93.59 Large
10 25JUN2024 10 8 127 Clothing 5 53.01 265.05 Debit Card 53.01 Large
11 19MAR2024 11 2 466 Sports 1 96.46 96.46 Debit Card 96.46 Medium
12 14JUL2024 12 4 479 Clothing 1 49.41 49.41 Debit Card 49.41 Small
13 16FEB2024 13 18 127 Electronics 3 31.34 94.02 PayPal 31.34 Medium
14 09JUL2024 14 17 362 Electronics 1 13.35 13.35 PayPal 13.35 Small
15 10JUL2024 15 14 285 Clothing 2 18.59 37.18 Net Banking 18.59 Small
16 21NOV2024 16 13 218 Clothing 1 70.01 70.01 PayPal 70.01 Medium
17 07MAR2024 17 3 85 Electronics 4 86.80 347.20 PayPal 86.80 Large
18 13DEC2024 18 17 72 Sports 5 15.32 76.60 Net Banking 15.32 Medium
19 18MAR2024 19 5 213 Sports 5 19.52 97.60 Debit Card 19.52 Medium
20 18DEC2024 20 11 128 Sports 4 14.64 58.56 Net Banking 14.64 Medium

/*Macro to Generate Sales Report for a Given Category*/

%macro category_sales_report(category);

    proc sql;

        title "Sales Report for &category";

        select Product_ID, 

               sum(Quantity) as Total_Units_Sold,

               sum(Total_Amount) as Total_Sales format=dollar12.2

        from online_retail

        where Product_Category = "&category"

        group by Product_ID

        order by Total_Sales desc;

    quit;

    title;

%mend;


%category_sales_report(Electronics);

Output:

                                                         Sales Report for Electronics

Product_ID Total_Units_Sold Total_Sales
85 4 $347.20
433 2 $180.80
127 3 $94.02
362 1 $13.35

/*Macro to Create Separate Datasets for Each Payment Method*/

proc sql noprint;

    select distinct Payment_Method into :payment1 - :payment4

    from online_retail;

quit;


%macro split_by_payment;

    %do i = 1 %to 4;

        data payment_&&payment&i;

            set online_retail;

            where Payment_Method = "&&payment&i";

        run;

        proc print;run;

    %end;

%mend;


%split_by_payment;

Output:


Obs Transaction_Date Transaction_ID Customer_ID Product_ID Product_Category Quantity Unit_Price Total_Amount Payment_Method Avg_Price_Per_Unit Transaction_Size
1 11AUG2024 2 16 273 Home & Kitchen 4 17.75 71.00 Credit Card 17.75 Medium
2 30JUL2024 9 15 46 Sports 5 93.59 467.95 Credit Card 93.59 Large



Obs Transaction_Date Transaction_ID Customer_ID Product_ID Product_Category Quantity Unit_Price Total_Amount Payment_Method Avg_Price_Per_Unit Transaction_Size
1 26DEC2024 1 14 462 Books 5 64.27 321.35 Debit Card 64.27 Large
2 25FEB2024 4 13 349 Sports 4 72.42 289.68 Debit Card 72.42 Large
3 18JAN2024 6 3 433 Electronics 2 90.40 180.80 Debit Card 90.40 Large
4 25JUN2024 10 8 127 Clothing 5 53.01 265.05 Debit Card 53.01 Large
5 19MAR2024 11 2 466 Sports 1 96.46 96.46 Debit Card 96.46 Medium
6 14JUL2024 12 4 479 Clothing 1 49.41 49.41 Debit Card 49.41 Small
7 18MAR2024 19 5 213 Sports 5 19.52 97.60 Debit Card 19.52 Medium



Obs Transaction_Date Transaction_ID Customer_ID Product_ID Product_Category Quantity Unit_Price Total_Amount Payment_Method Avg_Price_Per_Unit Transaction_Size
1 02APR2024 5 1 85 Books 4 65.45 261.80 Net Banking 65.45 Large
2 10JUL2024 15 14 285 Clothing 2 18.59 37.18 Net Banking 18.59 Small
3 13DEC2024 18 17 72 Sports 5 15.32 76.60 Net Banking 15.32 Medium
4 18DEC2024 20 11 128 Sports 4 14.64 58.56 Net Banking 14.64 Medium



Obs Transaction_Date Transaction_ID Customer_ID Product_ID Product_Category Quantity Unit_Price Total_Amount Payment_Method Avg_Price_Per_Unit Transaction_Size
1 02DEC2024 3 11 339 Home & Kitchen 4 53.63 214.52 PayPal 53.63 Large
2 03JUN2024 7 7 463 Sports 1 25.71 25.71 PayPal 25.71 Small
3 25OCT2024 8 7 261 Books 1 50.36 50.36 PayPal 50.36 Medium
4 16FEB2024 13 18 127 Electronics 3 31.34 94.02 PayPal 31.34 Medium
5 09JUL2024 14 17 362 Electronics 1 13.35 13.35 PayPal 13.35 Small
6 21NOV2024 16 13 218 Clothing 1 70.01 70.01 PayPal 70.01 Medium
7 07MAR2024 17 3 85 Electronics 4 86.80 347.20 PayPal 86.80 Large

/*Bar Chart of Sales by Product Category*/

proc sgplot data=online_retail;

    vbar Product_Category / response=Total_Amount stat=sum datalabel;

    yaxis label="Total Sales";

    title "Total Sales by Product Category";

run;

Log:

NOTE: PROCEDURE SGPLOT used (Total process time):

      real time           2.79 seconds

      cpu time            0.53 seconds


NOTE: Listing image output written to SGPlot1.png.

NOTE: There were 20 observations read from the data set WORK.ONLINE_RETAIL.


/*Line Chart of Monthly Sales Trend*/

proc sql;

    create table monthly_sales as

    select month(Transaction_Date) as Month, 

           sum(Total_Amount) as Monthly_Sales

    from online_retail

    group by calculated Month

    order by calculated Month;

quit;


proc sgplot data=monthly_sales;

    series x=Month y=Monthly_Sales / markers;

    xaxis label="Month";

    yaxis label="Sales";

    title "Monthly Sales Trend";

run;

Log:

NOTE: PROCEDURE SGPLOT used (Total process time):

      real time           0.53 seconds

      cpu time            0.07 seconds


NOTE: Listing image output written to SGPlot3.png.

NOTE: There were 10 observations read from the data set WORK.MONTHLY_SALES.


/*Generating Reports*/

ods pdf file="Online_Retail_Report.pdf";


proc report data=online_retail nowd;

    column Product_Category Payment_Method Total_Amount;

    define Product_Category / group;

    define Payment_Method / group;

    define Total_Amount / analysis sum format=dollar12.2;

    title "Sales Report by Category and Payment Method";

run;


ods pdf close;

Output:
                                             Sales Report by Category and Payment Method

Product_Category Payment_Method Total_Amount
Books Debit Card $321.35
  Net Banking $261.80
  PayPal $50.36
Clothing Debit Card $314.46
  Net Banking $37.18
  PayPal $70.01
Electronics Debit Card $180.80
  PayPal $454.57
Home & Kitchen Credit Card $71.00
  PayPal $214.52
Sports Credit Card $467.95
  Debit Card $483.74
  Net Banking $135.16
  PayPal $25.71


PRACTICE AND COMMENT YOUR CODE: 

--->PLEASE FOLLOW OUR BLOG FOR MORE UPDATES.

PLEASE FOLLOW OUR TELEGRAM CHANNEL CLICK HERE

PLEASE FOLLOW OUR FACEBOOK PAGE  CLICK HERE

Comments