161.ULTIMATE GUIDE TO MASTERING BOOKSTORE SALES DATA ANALYSIS USING SAS | DATA | PROC CONTENTS | PROC PRINT | PROC MEANS | PROC FREQ | PROC SGPLOT | PROC SORT | PROC REPORT | PROC CORR | PROC REG | PROC EXPORT | DATA CLEANING TECHNIQUES
ULTIMATE GUIDE TO MASTERING BOOKSTORE SALES DATA ANALYSIS USING SAS | DATA | PROC CONTENTS | PROC PRINT | PROC MEANS | PROC FREQ | PROC SGPLOT | PROC SORT | PROC REPORT | PROC CORR | PROC REG | PROC EXPORT | DATA CLEANING TECHNIQUES
/*Creating a unique dataset centered around bookstore sales and demonstrating various SAS procedures using this dataset.*/
data bookstore_sales;
length Date 8 Branch $20 Genre $20 Title $50 Author $30 Units_Sold 8 Revenue 8;
format Date yymmdd10.;
input Date : yymmdd10. Branch : $20. Genre : $20. Title & $50. Author & $30. Units_Sold Revenue;
datalines;
2025-04-01 Hyderabad Fiction The Silent Patient Alex Michaelides 5 1500
2025-04-01 Mumbai Non-Fiction Sapiens Yuval Noah Harari 3 1800
2025-04-01 Delhi Science A Brief History of Time Stephen Hawking 2 1000
2025-04-02 Hyderabad Fiction The Midnight Library Matt Haig 4 1200
2025-04-02 Mumbai Fiction The Alchemist Paulo Coelho 6 1800
;
run;
proc print;run;
Output:
| Obs | Date | Branch | Genre | Title | Author | Units_Sold | Revenue |
|---|---|---|---|---|---|---|---|
| 1 | 2025-04-01 | Hyderabad | Fiction | The Silent Patient | Alex Michaelides | 5 | 1500 |
| 2 | 2025-04-01 | Mumbai | Non-Fiction | Sapiens | Yuval Noah Harari | 3 | 1800 |
| 3 | 2025-04-01 | Delhi | Science | A Brief History of Time | Stephen Hawking | 2 | 1000 |
| 4 | 2025-04-02 | Hyderabad | Fiction | The Midnight Library | Matt Haig | 4 | 1200 |
| 5 | 2025-04-02 | Mumbai | Fiction | The Alchemist | Paulo Coelho | 6 | 1800 |
proc contents data=bookstore_sales;
run;
Output:
| Data Set Name | WORK.BOOKSTORE_SALES | Observations | 5 |
|---|---|---|---|
| Member Type | DATA | Variables | 7 |
| Engine | V9 | Indexes | 0 |
| Created | 14/09/2015 00:05:59 | Observation Length | 144 |
| Last Modified | 14/09/2015 00:05:59 | Deleted Observations | 0 |
| Protection | Compressed | NO | |
| Data Set Type | Sorted | NO | |
| Label | |||
| Data Representation | WINDOWS_64 | ||
| Encoding | wlatin1 Western (Windows) |
| Engine/Host Dependent Information | |
|---|---|
| Data Set Page Size | 65536 |
| Number of Data Set Pages | 1 |
| First Data Page | 1 |
| Max Obs per Page | 454 |
| Obs in First Data Page | 5 |
| Number of Data Set Repairs | 0 |
| ExtendObsCounter | YES |
| Filename | C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD8676_DESKTOP-QFAA4KV_\bookstore_sales.sas7bdat |
| Release Created | 9.0401M2 |
| Host Created | X64_8HOME |
| Alphabetic List of Variables and Attributes | ||||
|---|---|---|---|---|
| # | Variable | Type | Len | Format |
| 5 | Author | Char | 30 | |
| 2 | Branch | Char | 20 | |
| 1 | Date | Num | 8 | YYMMDD10. |
| 3 | Genre | Char | 20 | |
| 7 | Revenue | Num | 8 | |
| 4 | Title | Char | 50 | |
| 6 | Units_Sold | Num | 8 | |
proc print data=bookstore_sales;
title "Bookstore Sales Data";
run;
Output:
| Bookstore Sales Data |
| Obs | Date | Branch | Genre | Title | Author | Units_Sold | Revenue |
|---|---|---|---|---|---|---|---|
| 1 | 2025-04-01 | Hyderabad | Fiction | The Silent Patient | Alex Michaelides | 5 | 1500 |
| 2 | 2025-04-01 | Mumbai | Non-Fiction | Sapiens | Yuval Noah Harari | 3 | 1800 |
| 3 | 2025-04-01 | Delhi | Science | A Brief History of Time | Stephen Hawking | 2 | 1000 |
| 4 | 2025-04-02 | Hyderabad | Fiction | The Midnight Library | Matt Haig | 4 | 1200 |
| 5 | 2025-04-02 | Mumbai | Fiction | The Alchemist | Paulo Coelho | 6 | 1800 |
proc means data=bookstore_sales mean median min max std;
var Units_Sold Revenue;
run;
Output:
| Variable | Mean | Median | Minimum | Maximum | Std Dev | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
|
|
|
|
proc freq data=bookstore_sales;
tables Genre Branch;
run;
Output:
| Genre | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
|---|---|---|---|---|
| Fiction | 3 | 60.00 | 3 | 60.00 |
| Non-Fiction | 1 | 20.00 | 4 | 80.00 |
| Science | 1 | 20.00 | 5 | 100.00 |
| Branch | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
|---|---|---|---|---|
| Delhi | 1 | 20.00 | 1 | 20.00 |
| Hyderabad | 2 | 40.00 | 3 | 60.00 |
| Mumbai | 2 | 40.00 | 5 | 100.00 |
proc sgplot data=bookstore_sales;
vbar Genre / response=Revenue stat=sum;
title "Total Revenue by Genre";
run;
Log:
NOTE: PROCEDURE SGPLOT used (Total process time):
real time 2.96 seconds
cpu time 0.62 seconds
NOTE: Listing image output written to SGPlot1.png.
NOTE: There were 5 observations read from the data set WORK.BOOKSTORE_SALES.
proc sgplot data=bookstore_sales;
series x=Date y=Revenue / markers;
title "Daily Revenue Trend";
run;
Log:
NOTE: PROCEDURE SGPLOT used (Total process time):
real time 0.73 seconds
cpu time 0.04 seconds
NOTE: The column format YYMMDD10 is replaced by an auto-generated format on the axis.
NOTE: The column format YYMMDD10 is replaced by an auto-generated format on the axis.
NOTE: Listing image output written to SGPlot3.png.
NOTE: There were 5 observations read from the data set WORK.BOOKSTORE_SALES.
data bookstore_sales;
set bookstore_sales;
Avg_Price = Revenue / Units_Sold;
run;
proc print;run;
Output:
| Daily Revenue Trend |
| Obs | Date | Branch | Genre | Title | Author | Units_Sold | Revenue | Avg_Price |
|---|---|---|---|---|---|---|---|---|
| 1 | 2025-04-01 | Hyderabad | Fiction | The Silent Patient | Alex Michaelides | 5 | 1500 | 300 |
| 2 | 2025-04-01 | Mumbai | Non-Fiction | Sapiens | Yuval Noah Harari | 3 | 1800 | 600 |
| 3 | 2025-04-01 | Delhi | Science | A Brief History of Time | Stephen Hawking | 2 | 1000 | 500 |
| 4 | 2025-04-02 | Hyderabad | Fiction | The Midnight Library | Matt Haig | 4 | 1200 | 300 |
| 5 | 2025-04-02 | Mumbai | Fiction | The Alchemist | Paulo Coelho | 6 | 1800 | 300 |
proc sort data=bookstore_sales;
by descending Revenue;
run;
proc print;run;
Output:
| Obs | Date | Branch | Genre | Title | Author | Units_Sold | Revenue | Avg_Price |
|---|---|---|---|---|---|---|---|---|
| 1 | 2025-04-01 | Mumbai | Non-Fiction | Sapiens | Yuval Noah Harari | 3 | 1800 | 600 |
| 2 | 2025-04-02 | Mumbai | Fiction | The Alchemist | Paulo Coelho | 6 | 1800 | 300 |
| 3 | 2025-04-01 | Hyderabad | Fiction | The Silent Patient | Alex Michaelides | 5 | 1500 | 300 |
| 4 | 2025-04-02 | Hyderabad | Fiction | The Midnight Library | Matt Haig | 4 | 1200 | 300 |
| 5 | 2025-04-01 | Delhi | Science | A Brief History of Time | Stephen Hawking | 2 | 1000 | 500 |
proc report data=bookstore_sales nowd;
column Branch Units_Sold Revenue;
define Branch / group;
define Units_Sold / analysis sum;
define Revenue / analysis sum;
title "Total Units Sold and Revenue by Branch";
run;
Output:
| Total Units Sold and Revenue by
Branch |
| Branch | Units_Sold | Revenue |
|---|---|---|
| Delhi | 2 | 1000 |
| Hyderabad | 9 | 2700 |
| Mumbai | 9 | 3600 |
proc corr data=bookstore_sales;
var Units_Sold Revenue;
run;
Output:
| 2 Variables: | Units_Sold Revenue |
|---|
| Simple Statistics | ||||||
|---|---|---|---|---|---|---|
| Variable | N | Mean | Std Dev | Sum | Minimum | Maximum |
| Units_Sold | 5 | 4.00000 | 1.58114 | 20.00000 | 2.00000 | 6.00000 |
| Revenue | 5 | 1460 | 357.77088 | 7300 | 1000 | 1800 |
| Pearson Correlation
Coefficients, N = 5 Prob > |r| under H0: Rho=0 | ||||||
|---|---|---|---|---|---|---|
| Units_Sold | Revenue | |||||
| Units_Sold |
|
| ||||
| Revenue |
|
| ||||
proc reg data=bookstore_sales;
model Revenue = Units_Sold;
title "Regression Analysis: Revenue vs. Units Sold";
run;
Output:
| Regression Analysis: Revenue vs. Units
Sold |
| Number of Observations Read | 5 |
|---|---|
| Number of Observations Used | 5 |
| Analysis of Variance | |||||
|---|---|---|---|---|---|
| Source | DF | Sum of Squares |
Mean Square |
F Value | Pr > F |
| Model | 1 | 169000 | 169000 | 1.48 | 0.3110 |
| Error | 3 | 343000 | 114333 | ||
| Corrected Total | 4 | 512000 | |||
| Root MSE | 338.13212 | R-Square | 0.3301 |
|---|---|---|---|
| Dependent Mean | 1460.00000 | Adj R-Sq | 0.1068 |
| Coeff Var | 23.15973 |
| Parameter Estimates | |||||
|---|---|---|---|---|---|
| Variable | DF | Parameter Estimate |
Standard Error |
t Value | Pr > |t| |
| Intercept | 1 | 940.00000 | 453.65185 | 2.07 | 0.1300 |
| Units_Sold | 1 | 130.00000 | 106.92677 | 1.22 | 0.3110 |
proc export data=bookstore_sales
outfile="bookstore_sales.csv"
dbms=csv
replace;
run;
Comments
Post a Comment