148.ENHANCING BOOKSTORE SALES ANALYSIS WITH SAS: A COMPREHENSIVE GUIDE TO DATA PROCESSING AND INSIGHTFUL REPORTING

ENHANCING BOOKSTORE SALES ANALYSIS WITH SAS: A COMPREHENSIVE GUIDE TO DATA PROCESSING AND INSIGHTFUL REPORTING


/*Step 1: Creating the Initial Dataset*/

/*Creating a dataset named bookstore_sales that contains the following variables:*/

BookID: Unique identifier for each book.

Title: Title of the book.

Author: Name of the author.

Genre: Genre of the book.

Price: Price of the book.

CopiesSold: Number of copies sold.

Rating: Average customer rating (out of 5).

PublicationDate: Date the book was published.


data bookstore_sales;

    infile datalines truncover;

    input BookID Title & $50. Author & $30. Genre & $20. Price CopiesSold Rating          PublicationDate :date9.;

    format PublicationDate date9.;

    datalines;

1  The Art of SAS  John Doe  Technology  49.99  150  4.5  15MAR2020

2  Cooking 101  Jane Smith  Cooking  29.95  200  4.7  22JUN2018

3  Mystery Manor  Alan Poe  Mystery  19.99  120  4.3  05OCT2019

4  Historical Facts  Mary Johnson  History  34.50  80  4.6  12JAN2021

5  Sci-Fi Adventures  Luke Skywalker  Science Fiction  25.00  300  4.8  30JUL2017

;

run;

proc print data=bookstore_sales;

    title "Bookstore Sales Data";

run;

Output:

Obs BookID Title Author Genre Price CopiesSold Rating PublicationDate
1 1 The Art of SAS John Doe Technology 49.99 150 4.5 15MAR2020
2 2 Cooking 101 Jane Smith Cooking 29.95 200 4.7 22JUN2018
3 3 Mystery Manor Alan Poe Mystery 19.99 120 4.3 05OCT2019
4 4 Historical Facts Mary Johnson History 34.50 80 4.6 12JAN2021
5 5 Sci-Fi Adventures Luke Skywalker Science Fiction 25.00 300 4.8 30JUL2017


/*Step 2: Exploring the Dataset*/

/*To understand the structure and contents of our dataset, we can use the PROC CONTENTS procedure:*/

proc contents data=bookstore_sales;

    title "Dataset Structure for Bookstore Sales";

run;

Output:

                                                    Dataset Structure for Bookstore Sales

                                                          The CONTENTS Procedure

Data Set Name WORK.BOOKSTORE_SALES Observations 5
Member Type DATA Variables 8
Engine V9 Indexes 0
Created 14/09/2015 00:52:48 Observation Length 144
Last Modified 14/09/2015 00:52:48 Deleted Observations 0
Protection   Compressed NO
Data Set Type   Sorted NO
Label      
Data Representation WINDOWS_64    
Encoding wlatin1 Western (Windows)    


Engine/Host Dependent Information
Data Set Page Size 65536
Number of Data Set Pages 1
First Data Page 1
Max Obs per Page 454
Obs in First Data Page 5
Number of Data Set Repairs 0
ExtendObsCounter YES
Filename C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD10152_DESKTOP-QFAA4KV_\bookstore_sales.sas7bdat
Release Created 9.0401M2
Host Created X64_8HOME


Alphabetic List of Variables and Attributes
# Variable Type Len Format
3 Author Char 30  
1 BookID Num 8  
6 CopiesSold Num 8  
4 Genre Char 20  
5 Price Num 8  
8 PublicationDate Num 8 DATE9.
7 Rating Num 8  
2 Title Char 50  


/*Step 3: Data Manipulation Techniques*/

/*Adding New Variables*/

/*Suppose we want to calculate the total revenue for each book and classify the sales performance. We can add two new variables: TotalRevenue and SalesCategory.*/

data bookstore_sales;

    set bookstore_sales;

    TotalRevenue = Price * CopiesSold;

    if CopiesSold > 250 then SalesCategory = 'Best Seller';

    else if CopiesSold between 100 and 250 then SalesCategory = 'Average Seller';

    else SalesCategory = 'Low Seller';

run;

proc print data=bookstore_sales;

run;

Output:

Obs BookID Title Author Genre Price CopiesSold Rating PublicationDate TotalRevenue SalesCategory
1 1 The Art of SAS John Doe Technology 49.99 150 4.5 15MAR2020 7498.5 Average Seller
2 2 Cooking 101 Jane Smith Cooking 29.95 200 4.7 22JUN2018 5990.0 Average Seller
3 3 Mystery Manor Alan Poe Mystery 19.99 120 4.3 05OCT2019 2398.8 Average Seller
4 4 Historical Facts Mary Johnson History 34.50 80 4.6 12JAN2021 2760.0 Low Seller
5 5 Sci-Fi Adventures Luke Skywalker Science Fiction 25.00 300 4.8 30JUL2017 7500.0 Best Seller


/*Sorting Data*/

/*To sort the books by TotalRevenue in descending order:*/

proc sort data=bookstore_sales;

    by descending TotalRevenue;

run;

proc print data=bookstore_sales;

run;

Output:

Obs BookID Title Author Genre Price CopiesSold Rating PublicationDate TotalRevenue SalesCategory
1 5 Sci-Fi Adventures Luke Skywalker Science Fiction 25.00 300 4.8 30JUL2017 7500.0 Best Seller
2 1 The Art of SAS John Doe Technology 49.99 150 4.5 15MAR2020 7498.5 Average Seller
3 2 Cooking 101 Jane Smith Cooking 29.95 200 4.7 22JUN2018 5990.0 Average Seller
4 4 Historical Facts Mary Johnson History 34.50 80 4.6 12JAN2021 2760.0 Low Seller
5 3 Mystery Manor Alan Poe Mystery 19.99 120 4.3 05OCT2019 2398.8 Average Seller


/*Step 4: Aggregating Data*/

/*To analyze the average rating and total copies sold per genre, we can use PROC MEANS:*/

proc means data=bookstore_sales ;

    class Genre;

    var Rating CopiesSold;

    output out=genre_summary mean(Rating)=AvgRating sum(CopiesSold)=TotalCopiesSold;

run;

Output:

                                                                     The MEANS Procedure

Genre N Obs Variable N Mean Std Dev Minimum Maximum
Cooking 1
Rating
CopiesSold
1
1
4.7000000
200.0000000
.
.
4.7000000
200.0000000
4.7000000
200.0000000
History 1
Rating
CopiesSold
1
1
4.6000000
80.0000000
.
.
4.6000000
80.0000000
4.6000000
80.0000000
Mystery 1
Rating
CopiesSold
1
1
4.3000000
120.0000000
.
.
4.3000000
120.0000000
4.3000000
120.0000000
Science Fiction 1
Rating
CopiesSold
1
1
4.8000000
300.0000000
.
.
4.8000000
300.0000000
4.8000000
300.0000000
Technology 1
Rating
CopiesSold
1
1
4.5000000
150.0000000
.
.
4.5000000
150.0000000
4.5000000
150.0000000


/*Step 5: Data Visualization*/

/*Visualizing data can provide insights that are not immediately apparent from tables. SAS offers various procedures for creating plots and charts.*/

/*Bar Chart of Total Revenue by Genre*/

proc sgplot data=bookstore_sales;

    vbar Genre / response=TotalRevenue stat=sum;

    title "Total Revenue by Genre";

run;


/*Scatter Plot of Rating vs. Price*/

proc sgplot data=bookstore_sales;

    scatter x=Price y=Rating / datalabel=Title;

    title "Customer Rating vs. Book Price";

run;


/*Step 6: Advanced Data Techniques*/

/*Merging Datasets*/

/*If we have another dataset containing information about the publishers, we can merge it with our bookstore_sales dataset.*/

/*Assuming we have a dataset publishers:*/

data publishers;

    input PublisherID PublisherName $;

    datalines;

1 "Tech Books Publishing"

2 "Culinary Press"

3 "Mystery House"

4 "Historical Publications"

5 "Sci-Fi World"

;

run;

proc print;

run;

Output:

Obs PublisherID PublisherName
1 1 "Tech
2 2 "Culinar
3 3 "Mystery
4 4 "Histori
5 5 "Sci-Fi

data book_publishers;

    input BookID PublisherID;

    datalines;

1 1

2 2

3 3

4 4

5 5

;

run;

proc print;

run;

Output:

Obs BookID PublisherID
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5

/*We can merge these datasets to include publisher information in our sales data:*/

proc sort data=bookstore_sales;

    by BookID;

run;

proc print;

run;

Output:

Obs BookID Title Author Genre Price CopiesSold Rating PublicationDate TotalRevenue SalesCategory
1 1 The Art of SAS John Doe Technology 49.99 150 4.5 15MAR2020 7498.5 Average Seller
2 2 Cooking 101 Jane Smith Cooking 29.95 200 4.7 22JUN2018 5990.0 Average Seller
3 3 Mystery Manor Alan Poe Mystery 19.99 120 4.3 05OCT2019 2398.8 Average Seller
4 4 Historical Facts Mary Johnson History 34.50 80 4.6 12JAN2021 2760.0 Low Seller
5 5 Sci-Fi Adventures Luke Skywalker Science Fiction 25.00 300 4.8 30JUL2017 7500.0 Best Seller


proc sort data=book_publishers;

    by BookID;

run;

proc print;

run;

Output:

Obs BookID PublisherID
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5


data bookstore_sales_with_publishers;

    merge bookstore_sales(in=a) book_publishers(in=b);

    by BookID;

    if a and b;

run;

proc print;

run;

Output:

Obs BookID Title Author Genre Price CopiesSold Rating PublicationDate TotalRevenue SalesCategory PublisherID
1 1 The Art of SAS John Doe Technology 49.99 150 4.5 15MAR2020 7498.5 Average Seller 1
2 2 Cooking 101 Jane Smith Cooking 29.95 200 4.7 22JUN2018 5990.0 Average Seller 2
3 3 Mystery Manor Alan Poe Mystery 19.99 120 4.3 05OCT2019 2398.8 Average Seller 3
4 4 Historical Facts Mary Johnson History 34.50 80 4.6 12JAN2021 2760.0 Low Seller 4
5 5 Sci-Fi Adventures Luke Skywalker Science Fiction 25.00 300 4.8 30JUL2017 7500.0 Best Seller 5


proc sort data=bookstore_sales_with_publishers;

    by PublisherID;

run;

proc print;

run;

Output:

Obs BookID Title Author Genre Price CopiesSold Rating PublicationDate TotalRevenue SalesCategory PublisherID
1 1 The Art of SAS John Doe Technology 49.99 150 4.5 15MAR2020 7498.5 Average Seller 1
2 2 Cooking 101 Jane Smith Cooking 29.95 200 4.7 22JUN2018 5990.0 Average Seller 2
3 3 Mystery Manor Alan Poe Mystery 19.99 120 4.3 05OCT2019 2398.8 Average Seller 3
4 4 Historical Facts Mary Johnson History 34.50 80 4.6 12JAN2021 2760.0 Low Seller 4
5 5 Sci-Fi Adventures Luke Skywalker Science Fiction 25.00 300 4.8 30JUL2017 7500.0 Best Seller 5


data final_dataset;

    merge bookstore_sales_with_publishers(in=a) publishers(in=b);

    by PublisherID;

    if a and b;

run;

proc print;

run;

Output:

Obs BookID Title Author Genre Price CopiesSold Rating PublicationDate TotalRevenue SalesCategory PublisherID PublisherName
1 1 The Art of SAS John Doe Technology 49.99 150 4.5 15MAR2020 7498.5 Average Seller 1 "Tech
2 2 Cooking 101 Jane Smith Cooking 29.95 200 4.7 22JUN2018 5990.0 Average Seller 2 "Culinar
3 3 Mystery Manor Alan Poe Mystery 19.99 120 4.3 05OCT2019 2398.8 Average Seller 3 "Mystery
4 4 Historical Facts Mary Johnson History 34.50 80 4.6 12JAN2021 2760.0 Low Seller 4 "Histori
5 5 Sci-Fi Adventures Luke Skywalker Science Fiction 25.00 300 4.8 30JUL2017 7500.0 Best Seller 5 "Sci-Fi


PRACTICE AND COMMENT YOUR CODE: 

-->PLEASE FOLLOW OUR BLOG FOR MORE UPDATES.

TO FOLLOW OUR TELEGRAM CHANNEL CLICK HERE

Comments