148.ENHANCING BOOKSTORE SALES ANALYSIS WITH SAS: A COMPREHENSIVE GUIDE TO DATA PROCESSING AND INSIGHTFUL REPORTING
- Get link
- X
- Other Apps
ENHANCING BOOKSTORE SALES ANALYSIS WITH SAS: A COMPREHENSIVE GUIDE TO DATA PROCESSING AND INSIGHTFUL REPORTING
/*Step 1: Creating the Initial Dataset*/
/*Creating a dataset named bookstore_sales that contains the following variables:*/
BookID: Unique identifier for each book.
Title: Title of the book.
Author: Name of the author.
Genre: Genre of the book.
Price: Price of the book.
CopiesSold: Number of copies sold.
Rating: Average customer rating (out of 5).
PublicationDate: Date the book was published.
data bookstore_sales;
infile datalines truncover;
input BookID Title & $50. Author & $30. Genre & $20. Price CopiesSold Rating PublicationDate :date9.;
format PublicationDate date9.;
datalines;
1 The Art of SAS John Doe Technology 49.99 150 4.5 15MAR2020
2 Cooking 101 Jane Smith Cooking 29.95 200 4.7 22JUN2018
3 Mystery Manor Alan Poe Mystery 19.99 120 4.3 05OCT2019
4 Historical Facts Mary Johnson History 34.50 80 4.6 12JAN2021
5 Sci-Fi Adventures Luke Skywalker Science Fiction 25.00 300 4.8 30JUL2017
;
run;
proc print data=bookstore_sales;
title "Bookstore Sales Data";
run;
Output:
Obs | BookID | Title | Author | Genre | Price | CopiesSold | Rating | PublicationDate |
---|---|---|---|---|---|---|---|---|
1 | 1 | The Art of SAS | John Doe | Technology | 49.99 | 150 | 4.5 | 15MAR2020 |
2 | 2 | Cooking 101 | Jane Smith | Cooking | 29.95 | 200 | 4.7 | 22JUN2018 |
3 | 3 | Mystery Manor | Alan Poe | Mystery | 19.99 | 120 | 4.3 | 05OCT2019 |
4 | 4 | Historical Facts | Mary Johnson | History | 34.50 | 80 | 4.6 | 12JAN2021 |
5 | 5 | Sci-Fi Adventures | Luke Skywalker | Science Fiction | 25.00 | 300 | 4.8 | 30JUL2017 |
/*Step 2: Exploring the Dataset*/
/*To understand the structure and contents of our dataset, we can use the PROC CONTENTS procedure:*/
proc contents data=bookstore_sales;
title "Dataset Structure for Bookstore Sales";
run;
Output:
Dataset Structure for Bookstore
Sales |
Data Set Name | WORK.BOOKSTORE_SALES | Observations | 5 |
---|---|---|---|
Member Type | DATA | Variables | 8 |
Engine | V9 | Indexes | 0 |
Created | 14/09/2015 00:52:48 | Observation Length | 144 |
Last Modified | 14/09/2015 00:52:48 | Deleted Observations | 0 |
Protection | Compressed | NO | |
Data Set Type | Sorted | NO | |
Label | |||
Data Representation | WINDOWS_64 | ||
Encoding | wlatin1 Western (Windows) |
Engine/Host Dependent Information | |
---|---|
Data Set Page Size | 65536 |
Number of Data Set Pages | 1 |
First Data Page | 1 |
Max Obs per Page | 454 |
Obs in First Data Page | 5 |
Number of Data Set Repairs | 0 |
ExtendObsCounter | YES |
Filename | C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD10152_DESKTOP-QFAA4KV_\bookstore_sales.sas7bdat |
Release Created | 9.0401M2 |
Host Created | X64_8HOME |
Alphabetic List of Variables and Attributes | ||||
---|---|---|---|---|
# | Variable | Type | Len | Format |
3 | Author | Char | 30 | |
1 | BookID | Num | 8 | |
6 | CopiesSold | Num | 8 | |
4 | Genre | Char | 20 | |
5 | Price | Num | 8 | |
8 | PublicationDate | Num | 8 | DATE9. |
7 | Rating | Num | 8 | |
2 | Title | Char | 50 |
/*Step 3: Data Manipulation Techniques*/
/*Adding New Variables*/
/*Suppose we want to calculate the total revenue for each book and classify the sales performance. We can add two new variables: TotalRevenue and SalesCategory.*/
data bookstore_sales;
set bookstore_sales;
TotalRevenue = Price * CopiesSold;
if CopiesSold > 250 then SalesCategory = 'Best Seller';
else if CopiesSold between 100 and 250 then SalesCategory = 'Average Seller';
else SalesCategory = 'Low Seller';
run;
proc print data=bookstore_sales;
run;
Output:
Obs | BookID | Title | Author | Genre | Price | CopiesSold | Rating | PublicationDate | TotalRevenue | SalesCategory |
---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | The Art of SAS | John Doe | Technology | 49.99 | 150 | 4.5 | 15MAR2020 | 7498.5 | Average Seller |
2 | 2 | Cooking 101 | Jane Smith | Cooking | 29.95 | 200 | 4.7 | 22JUN2018 | 5990.0 | Average Seller |
3 | 3 | Mystery Manor | Alan Poe | Mystery | 19.99 | 120 | 4.3 | 05OCT2019 | 2398.8 | Average Seller |
4 | 4 | Historical Facts | Mary Johnson | History | 34.50 | 80 | 4.6 | 12JAN2021 | 2760.0 | Low Seller |
5 | 5 | Sci-Fi Adventures | Luke Skywalker | Science Fiction | 25.00 | 300 | 4.8 | 30JUL2017 | 7500.0 | Best Seller |
/*Sorting Data*/
/*To sort the books by TotalRevenue in descending order:*/
proc sort data=bookstore_sales;
by descending TotalRevenue;
run;
proc print data=bookstore_sales;
run;
Output:
Obs | BookID | Title | Author | Genre | Price | CopiesSold | Rating | PublicationDate | TotalRevenue | SalesCategory |
---|---|---|---|---|---|---|---|---|---|---|
1 | 5 | Sci-Fi Adventures | Luke Skywalker | Science Fiction | 25.00 | 300 | 4.8 | 30JUL2017 | 7500.0 | Best Seller |
2 | 1 | The Art of SAS | John Doe | Technology | 49.99 | 150 | 4.5 | 15MAR2020 | 7498.5 | Average Seller |
3 | 2 | Cooking 101 | Jane Smith | Cooking | 29.95 | 200 | 4.7 | 22JUN2018 | 5990.0 | Average Seller |
4 | 4 | Historical Facts | Mary Johnson | History | 34.50 | 80 | 4.6 | 12JAN2021 | 2760.0 | Low Seller |
5 | 3 | Mystery Manor | Alan Poe | Mystery | 19.99 | 120 | 4.3 | 05OCT2019 | 2398.8 | Average Seller |
/*Step 4: Aggregating Data*/
/*To analyze the average rating and total copies sold per genre, we can use PROC MEANS:*/
proc means data=bookstore_sales ;
class Genre;
var Rating CopiesSold;
output out=genre_summary mean(Rating)=AvgRating sum(CopiesSold)=TotalCopiesSold;
run;
Output:
Genre | N Obs | Variable | N | Mean | Std Dev | Minimum | Maximum | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Cooking | 1 |
|
|
|
|
|
| ||||||||||||
History | 1 |
|
|
|
|
|
| ||||||||||||
Mystery | 1 |
|
|
|
|
|
| ||||||||||||
Science Fiction | 1 |
|
|
|
|
|
| ||||||||||||
Technology | 1 |
|
|
|
|
|
|
/*Step 5: Data Visualization*/
/*Visualizing data can provide insights that are not immediately apparent from tables. SAS offers various procedures for creating plots and charts.*/
/*Bar Chart of Total Revenue by Genre*/
proc sgplot data=bookstore_sales;
vbar Genre / response=TotalRevenue stat=sum;
title "Total Revenue by Genre";
run;
/*Scatter Plot of Rating vs. Price*/
proc sgplot data=bookstore_sales;
scatter x=Price y=Rating / datalabel=Title;
title "Customer Rating vs. Book Price";
run;
/*Step 6: Advanced Data Techniques*/
/*Merging Datasets*/
/*If we have another dataset containing information about the publishers, we can merge it with our bookstore_sales dataset.*/
/*Assuming we have a dataset publishers:*/
data publishers;
input PublisherID PublisherName $;
datalines;
1 "Tech Books Publishing"
2 "Culinary Press"
3 "Mystery House"
4 "Historical Publications"
5 "Sci-Fi World"
;
run;
proc print;
run;
Output:
Obs | PublisherID | PublisherName |
---|---|---|
1 | 1 | "Tech |
2 | 2 | "Culinar |
3 | 3 | "Mystery |
4 | 4 | "Histori |
5 | 5 | "Sci-Fi |
data book_publishers;
input BookID PublisherID;
datalines;
1 1
2 2
3 3
4 4
5 5
;
run;
proc print;
run;
Output:
Obs | BookID | PublisherID |
---|---|---|
1 | 1 | 1 |
2 | 2 | 2 |
3 | 3 | 3 |
4 | 4 | 4 |
5 | 5 | 5 |
/*We can merge these datasets to include publisher information in our sales data:*/
proc sort data=bookstore_sales;
by BookID;
run;
proc print;
run;
Output:
Obs | BookID | Title | Author | Genre | Price | CopiesSold | Rating | PublicationDate | TotalRevenue | SalesCategory |
---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | The Art of SAS | John Doe | Technology | 49.99 | 150 | 4.5 | 15MAR2020 | 7498.5 | Average Seller |
2 | 2 | Cooking 101 | Jane Smith | Cooking | 29.95 | 200 | 4.7 | 22JUN2018 | 5990.0 | Average Seller |
3 | 3 | Mystery Manor | Alan Poe | Mystery | 19.99 | 120 | 4.3 | 05OCT2019 | 2398.8 | Average Seller |
4 | 4 | Historical Facts | Mary Johnson | History | 34.50 | 80 | 4.6 | 12JAN2021 | 2760.0 | Low Seller |
5 | 5 | Sci-Fi Adventures | Luke Skywalker | Science Fiction | 25.00 | 300 | 4.8 | 30JUL2017 | 7500.0 | Best Seller |
proc sort data=book_publishers;
by BookID;
run;
proc print;
run;
Output:
Obs | BookID | PublisherID |
---|---|---|
1 | 1 | 1 |
2 | 2 | 2 |
3 | 3 | 3 |
4 | 4 | 4 |
5 | 5 | 5 |
data bookstore_sales_with_publishers;
merge bookstore_sales(in=a) book_publishers(in=b);
by BookID;
if a and b;
run;
proc print;
run;
Output:
Obs | BookID | Title | Author | Genre | Price | CopiesSold | Rating | PublicationDate | TotalRevenue | SalesCategory | PublisherID |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | The Art of SAS | John Doe | Technology | 49.99 | 150 | 4.5 | 15MAR2020 | 7498.5 | Average Seller | 1 |
2 | 2 | Cooking 101 | Jane Smith | Cooking | 29.95 | 200 | 4.7 | 22JUN2018 | 5990.0 | Average Seller | 2 |
3 | 3 | Mystery Manor | Alan Poe | Mystery | 19.99 | 120 | 4.3 | 05OCT2019 | 2398.8 | Average Seller | 3 |
4 | 4 | Historical Facts | Mary Johnson | History | 34.50 | 80 | 4.6 | 12JAN2021 | 2760.0 | Low Seller | 4 |
5 | 5 | Sci-Fi Adventures | Luke Skywalker | Science Fiction | 25.00 | 300 | 4.8 | 30JUL2017 | 7500.0 | Best Seller | 5 |
proc sort data=bookstore_sales_with_publishers;
by PublisherID;
run;
proc print;
run;
Output:
Obs | BookID | Title | Author | Genre | Price | CopiesSold | Rating | PublicationDate | TotalRevenue | SalesCategory | PublisherID |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | The Art of SAS | John Doe | Technology | 49.99 | 150 | 4.5 | 15MAR2020 | 7498.5 | Average Seller | 1 |
2 | 2 | Cooking 101 | Jane Smith | Cooking | 29.95 | 200 | 4.7 | 22JUN2018 | 5990.0 | Average Seller | 2 |
3 | 3 | Mystery Manor | Alan Poe | Mystery | 19.99 | 120 | 4.3 | 05OCT2019 | 2398.8 | Average Seller | 3 |
4 | 4 | Historical Facts | Mary Johnson | History | 34.50 | 80 | 4.6 | 12JAN2021 | 2760.0 | Low Seller | 4 |
5 | 5 | Sci-Fi Adventures | Luke Skywalker | Science Fiction | 25.00 | 300 | 4.8 | 30JUL2017 | 7500.0 | Best Seller | 5 |
data final_dataset;
merge bookstore_sales_with_publishers(in=a) publishers(in=b);
by PublisherID;
if a and b;
run;
proc print;
run;
Output:
Obs | BookID | Title | Author | Genre | Price | CopiesSold | Rating | PublicationDate | TotalRevenue | SalesCategory | PublisherID | PublisherName |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | The Art of SAS | John Doe | Technology | 49.99 | 150 | 4.5 | 15MAR2020 | 7498.5 | Average Seller | 1 | "Tech |
2 | 2 | Cooking 101 | Jane Smith | Cooking | 29.95 | 200 | 4.7 | 22JUN2018 | 5990.0 | Average Seller | 2 | "Culinar |
3 | 3 | Mystery Manor | Alan Poe | Mystery | 19.99 | 120 | 4.3 | 05OCT2019 | 2398.8 | Average Seller | 3 | "Mystery |
4 | 4 | Historical Facts | Mary Johnson | History | 34.50 | 80 | 4.6 | 12JAN2021 | 2760.0 | Low Seller | 4 | "Histori |
5 | 5 | Sci-Fi Adventures | Luke Skywalker | Science Fiction | 25.00 | 300 | 4.8 | 30JUL2017 | 7500.0 | Best Seller | 5 | "Sci-Fi |
- Get link
- X
- Other Apps
Comments
Post a Comment