193.EXPLORING A COMPREHENSIVE COOLERS DATASET USING PROC SQL | PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC REPORT | PROC FORMAT | PROC SGPLOT | PROC MACRO | PROC TRANSPOSE | PROC UNIVARIATE | PROC CONTENTS IN SAS
- Get link
- X
- Other Apps
EXPLORING A COMPREHENSIVE COOLERS DATASET USING PROC SQL | PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC REPORT | PROC FORMAT | PROC SGPLOT | PROC MACRO | PROC TRANSPOSE | PROC UNIVARIATE | PROC CONTENTS IN SAS
/*A dataset of coolers and analyze it using different SAS procedures*/
Step 1:Dataset consists of variables like
Cooler_ID: Unique identifier
Cooler_Type: Type of cooler
Brand: Cooler brand
Cooling_Capacity: In liters/hour
Power_Consumption: In Watts
Price: In INR
Rating: Customer rating (1 to 5 stars)
Warranty_Years: Years of warranty
Stock_Status: Availability in stock
Step 2: Creating the Dataset
data coolers;
input Cooler_ID $ Cooler_Type: $15. Brand $ Cooling_Capacity Power_Consumption Price Rating Warranty_Years Stock_Status $;
datalines;
C001 AirCooler Symphony 30 180 6500 4.2 1 Yes
C002 WaterCooler BlueStar 60 420 14500 4.7 2 Yes
C003 TowerCooler Orient 40 200 7200 4.1 1 Yes
C004 DesertCooler Bajaj 80 500 12500 4.5 2 No
C005 MiniCooler Cello 20 120 4800 3.8 1 Yes
C006 AirCooler Kenstar 35 210 6900 4.0 1 No
C007 WaterCooler Voltas 70 450 15000 4.6 3 Yes
C008 TowerCooler Havells 45 220 7800 4.0 1 Yes
C009 DesertCooler Symphony 85 550 13200 4.3 2 Yes
C010 MiniCooler Bajaj 15 100 4300 3.5 1 No
C011 AirCooler Crompton 32 190 6600 4.1 1 Yes
C012 WaterCooler LG 65 400 14800 4.8 2 Yes
C013 TowerCooler Kenstar 50 230 8000 4.2 1 Yes
C014 DesertCooler Orient 90 520 13500 4.4 2 No
C015 MiniCooler BlueStar 18 110 4600 3.7 1 Yes
;
run;
proc print;run;
Output:
Obs | Cooler_ID | Cooler_Type | Brand | Cooling_Capacity | Power_Consumption | Price | Rating | Warranty_Years | Stock_Status |
---|---|---|---|---|---|---|---|---|---|
1 | C001 | AirCooler | Symphony | 30 | 180 | 6500 | 4.2 | 1 | Yes |
2 | C002 | WaterCooler | BlueStar | 60 | 420 | 14500 | 4.7 | 2 | Yes |
3 | C003 | TowerCooler | Orient | 40 | 200 | 7200 | 4.1 | 1 | Yes |
4 | C004 | DesertCooler | Bajaj | 80 | 500 | 12500 | 4.5 | 2 | No |
5 | C005 | MiniCooler | Cello | 20 | 120 | 4800 | 3.8 | 1 | Yes |
6 | C006 | AirCooler | Kenstar | 35 | 210 | 6900 | 4.0 | 1 | No |
7 | C007 | WaterCooler | Voltas | 70 | 450 | 15000 | 4.6 | 3 | Yes |
8 | C008 | TowerCooler | Havells | 45 | 220 | 7800 | 4.0 | 1 | Yes |
9 | C009 | DesertCooler | Symphony | 85 | 550 | 13200 | 4.3 | 2 | Yes |
10 | C010 | MiniCooler | Bajaj | 15 | 100 | 4300 | 3.5 | 1 | No |
11 | C011 | AirCooler | Crompton | 32 | 190 | 6600 | 4.1 | 1 | Yes |
12 | C012 | WaterCooler | LG | 65 | 400 | 14800 | 4.8 | 2 | Yes |
13 | C013 | TowerCooler | Kenstar | 50 | 230 | 8000 | 4.2 | 1 | Yes |
14 | C014 | DesertCooler | Orient | 90 | 520 | 13500 | 4.4 | 2 | No |
15 | C015 | MiniCooler | BlueStar | 18 | 110 | 4600 | 3.7 | 1 | Yes |
Step 3: Exploring the Dataset
PROC PRINT: Basic View
proc print data=coolers;
title "Full Cooler Dataset";
run;
Output:
Full Cooler Dataset |
Obs | Cooler_ID | Cooler_Type | Brand | Cooling_Capacity | Power_Consumption | Price | Rating | Warranty_Years | Stock_Status |
---|---|---|---|---|---|---|---|---|---|
1 | C001 | AirCooler | Symphony | 30 | 180 | 6500 | 4.2 | 1 | Yes |
2 | C002 | WaterCooler | BlueStar | 60 | 420 | 14500 | 4.7 | 2 | Yes |
3 | C003 | TowerCooler | Orient | 40 | 200 | 7200 | 4.1 | 1 | Yes |
4 | C004 | DesertCooler | Bajaj | 80 | 500 | 12500 | 4.5 | 2 | No |
5 | C005 | MiniCooler | Cello | 20 | 120 | 4800 | 3.8 | 1 | Yes |
6 | C006 | AirCooler | Kenstar | 35 | 210 | 6900 | 4.0 | 1 | No |
7 | C007 | WaterCooler | Voltas | 70 | 450 | 15000 | 4.6 | 3 | Yes |
8 | C008 | TowerCooler | Havells | 45 | 220 | 7800 | 4.0 | 1 | Yes |
9 | C009 | DesertCooler | Symphony | 85 | 550 | 13200 | 4.3 | 2 | Yes |
10 | C010 | MiniCooler | Bajaj | 15 | 100 | 4300 | 3.5 | 1 | No |
11 | C011 | AirCooler | Crompton | 32 | 190 | 6600 | 4.1 | 1 | Yes |
12 | C012 | WaterCooler | LG | 65 | 400 | 14800 | 4.8 | 2 | Yes |
13 | C013 | TowerCooler | Kenstar | 50 | 230 | 8000 | 4.2 | 1 | Yes |
14 | C014 | DesertCooler | Orient | 90 | 520 | 13500 | 4.4 | 2 | No |
15 | C015 | MiniCooler | BlueStar | 18 | 110 | 4600 | 3.7 | 1 | Yes |
PROC CONTENTS: Metadata of Dataset
proc contents data=coolers;
title "Cooler Dataset Structure";
run;
Output:
Cooler Dataset Structure |
Data Set Name | WORK.COOLERS | Observations | 15 |
---|---|---|---|
Member Type | DATA | Variables | 9 |
Engine | V9 | Indexes | 0 |
Created | 14/09/2015 00:34:33 | Observation Length | 72 |
Last Modified | 14/09/2015 00:34:33 | Deleted Observations | 0 |
Protection | Compressed | NO | |
Data Set Type | Sorted | NO | |
Label | |||
Data Representation | WINDOWS_64 | ||
Encoding | wlatin1 Western (Windows) |
Engine/Host Dependent Information | |
---|---|
Data Set Page Size | 65536 |
Number of Data Set Pages | 1 |
First Data Page | 1 |
Max Obs per Page | 908 |
Obs in First Data Page | 15 |
Number of Data Set Repairs | 0 |
ExtendObsCounter | YES |
Filename | C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD5308_DESKTOP-QFAA4KV_\coolers.sas7bdat |
Release Created | 9.0401M2 |
Host Created | X64_8HOME |
Alphabetic List of Variables and Attributes | |||
---|---|---|---|
# | Variable | Type | Len |
3 | Brand | Char | 8 |
1 | Cooler_ID | Char | 8 |
2 | Cooler_Type | Char | 8 |
4 | Cooling_Capacity | Num | 8 |
5 | Power_Consumption | Num | 8 |
6 | Price | Num | 8 |
7 | Rating | Num | 8 |
9 | Stock_Status | Char | 8 |
8 | Warranty_Years | Num | 8 |
Step 4: Descriptive Statistics
PROC MEANS: Numeric Summary
proc means data=coolers n mean min max std;
var Cooling_Capacity Power_Consumption Price Rating;
title "Descriptive Statistics for Numeric Variables";
run;
Output:
Descriptive Statistics for Numeric Variables |
Variable | N | Mean | Minimum | Maximum | Std Dev | ||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
|
|
|
|
PROC UNIVARIATE: Detailed Stats
proc univariate data=coolers;
var Price;
title "Distribution of Cooler Prices";
run;
Output:
Distribution of Cooler Prices |
Moments | |||
---|---|---|---|
N | 15 | Sum Weights | 15 |
Mean | 9346.66667 | Sum Observations | 140200 |
Std Deviation | 4048.08009 | Variance | 16386952.4 |
Skewness | 0.2932028 | Kurtosis | -1.7229701 |
Uncorrected SS | 1539820000 | Corrected SS | 229417333 |
Coeff Variation | 43.3104146 | Std Error Mean | 1045.20978 |
Basic Statistical Measures | |||
---|---|---|---|
Location | Variability | ||
Mean | 9346.667 | Std Deviation | 4048 |
Median | 7800.000 | Variance | 16386952 |
Mode | . | Range | 10700 |
Interquartile Range | 7000 |
Tests for Location: Mu0=0 | ||||
---|---|---|---|---|
Test | Statistic | p Value | ||
Student's t | t | 8.942383 | Pr > |t| | <.0001 |
Sign | M | 7.5 | Pr >= |M| | <.0001 |
Signed Rank | S | 60 | Pr >= |S| | <.0001 |
Quantiles (Definition 5) | |
---|---|
Level | Quantile |
100% Max | 15000 |
99% | 15000 |
95% | 15000 |
90% | 14800 |
75% Q3 | 13500 |
50% Median | 7800 |
25% Q1 | 6500 |
10% | 4600 |
5% | 4300 |
1% | 4300 |
0% Min | 4300 |
Extreme Observations | |||
---|---|---|---|
Lowest | Highest | ||
Value | Obs | Value | Obs |
4300 | 10 | 13200 | 9 |
4600 | 15 | 13500 | 14 |
4800 | 5 | 14500 | 2 |
6500 | 1 | 14800 | 12 |
6600 | 11 | 15000 | 7 |
Step 5: Frequency Analysis
PROC FREQ: Cooler Types and Brands
proc freq data=coolers;
tables Cooler_Type Brand Stock_Status;
title "Frequency Tables for Categorical Variables";
run;
Output:
Frequency Tables for Categorical Variables |
Cooler_Type | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
---|---|---|---|---|
AirCooler | 3 | 20.00 | 3 | 20.00 |
DesertCooler | 3 | 20.00 | 6 | 40.00 |
MiniCooler | 3 | 20.00 | 9 | 60.00 |
TowerCooler | 3 | 20.00 | 12 | 80.00 |
WaterCooler | 3 | 20.00 | 15 | 100.00 |
Brand | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
---|---|---|---|---|
Bajaj | 2 | 13.33 | 2 | 13.33 |
BlueStar | 2 | 13.33 | 4 | 26.67 |
Cello | 1 | 6.67 | 5 | 33.33 |
Crompton | 1 | 6.67 | 6 | 40.00 |
Havells | 1 | 6.67 | 7 | 46.67 |
Kenstar | 2 | 13.33 | 9 | 60.00 |
LG | 1 | 6.67 | 10 | 66.67 |
Orient | 2 | 13.33 | 12 | 80.00 |
Symphony | 2 | 13.33 | 14 | 93.33 |
Voltas | 1 | 6.67 | 15 | 100.00 |
Stock_Status | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
---|---|---|---|---|
No | 4 | 26.67 | 4 | 26.67 |
Yes | 11 | 73.33 | 15 | 100.00 |
Step 6: Custom Reports
PROC REPORT: Summarize by Cooler Type
proc report data=coolers nowd;
column Cooler_Type Cooling_Capacity Price;
define Cooler_Type / group;
define Cooling_Capacity / analysis mean;
define Price / analysis mean;
title "Average Cooling Capacity and Price by Cooler Type";
run;
Output:
Average Cooling Capacity and Price by Cooler Type |
Cooler_Type | Cooling_Capacity | Price |
---|---|---|
AirCooler | 32.333333 | 6666.6667 |
DesertCooler | 85 | 13066.667 |
MiniCooler | 17.666667 | 4566.6667 |
TowerCooler | 45 | 7666.6667 |
WaterCooler | 65 | 14766.667 |
Step 7: Sorting and Filtering
PROC SORT: By Price Descending
proc sort data=coolers out=sorted_price;
by descending Price;
run;
proc print data=sorted_price;
title "Coolers Sorted by Price (Descending)";
run;
Output:
Coolers Sorted by Price (Descending) |
Obs | Cooler_ID | Cooler_Type | Brand | Cooling_Capacity | Power_Consumption | Price | Rating | Warranty_Years | Stock_Status |
---|---|---|---|---|---|---|---|---|---|
1 | C007 | WaterCooler | Voltas | 70 | 450 | 15000 | 4.6 | 3 | Yes |
2 | C012 | WaterCooler | LG | 65 | 400 | 14800 | 4.8 | 2 | Yes |
3 | C002 | WaterCooler | BlueStar | 60 | 420 | 14500 | 4.7 | 2 | Yes |
4 | C014 | DesertCooler | Orient | 90 | 520 | 13500 | 4.4 | 2 | No |
5 | C009 | DesertCooler | Symphony | 85 | 550 | 13200 | 4.3 | 2 | Yes |
6 | C004 | DesertCooler | Bajaj | 80 | 500 | 12500 | 4.5 | 2 | No |
7 | C013 | TowerCooler | Kenstar | 50 | 230 | 8000 | 4.2 | 1 | Yes |
8 | C008 | TowerCooler | Havells | 45 | 220 | 7800 | 4.0 | 1 | Yes |
9 | C003 | TowerCooler | Orient | 40 | 200 | 7200 | 4.1 | 1 | Yes |
10 | C006 | AirCooler | Kenstar | 35 | 210 | 6900 | 4.0 | 1 | No |
11 | C011 | AirCooler | Crompton | 32 | 190 | 6600 | 4.1 | 1 | Yes |
12 | C001 | AirCooler | Symphony | 30 | 180 | 6500 | 4.2 | 1 | Yes |
13 | C005 | MiniCooler | Cello | 20 | 120 | 4800 | 3.8 | 1 | Yes |
14 | C015 | MiniCooler | BlueStar | 18 | 110 | 4600 | 3.7 | 1 | Yes |
15 | C010 | MiniCooler | Bajaj | 15 | 100 | 4300 | 3.5 | 1 | No |
PROC SQL: Top 5 Expensive Coolers
proc sql outobs=5;
create table top_expensive as
select * from coolers
order by Price desc;
quit;
proc print data=top_expensive;
title "Top 5 Most Expensive Coolers";
run;
Output:
Top 5 Most Expensive Coolers |
Obs | Cooler_ID | Cooler_Type | Brand | Cooling_Capacity | Power_Consumption | Price | Rating | Warranty_Years | Stock_Status |
---|---|---|---|---|---|---|---|---|---|
1 | C007 | WaterCooler | Voltas | 70 | 450 | 15000 | 4.6 | 3 | Yes |
2 | C012 | WaterCooler | LG | 65 | 400 | 14800 | 4.8 | 2 | Yes |
3 | C002 | WaterCooler | BlueStar | 60 | 420 | 14500 | 4.7 | 2 | Yes |
4 | C014 | DesertCooler | Orient | 90 | 520 | 13500 | 4.4 | 2 | No |
5 | C009 | DesertCooler | Symphony | 85 | 550 | 13200 | 4.3 | 2 | Yes |
Step 8: Advanced SQL Analysis
SQL: Average Price and Power by Cooler Type
proc sql;
select Cooler_Type,
avg(Price) as Avg_Price,
avg(Power_Consumption) as Avg_Power
from coolers
group by Cooler_Type;
quit;
Output:
Cooler_Type | Avg_Price | Avg_Power |
---|---|---|
AirCooler | 6666.667 | 193.3333 |
DesertCooler | 13066.67 | 523.3333 |
MiniCooler | 4566.667 | 110 |
TowerCooler | 7666.667 | 216.6667 |
WaterCooler | 14766.67 | 423.3333 |
SQL: Count of In-Stock Coolers by Brand
proc sql;
select Brand, count(*) as InStock_Count
from coolers
where Stock_Status = 'Yes'
group by Brand;
quit;
Output:
Brand | InStock_Count |
---|---|
BlueStar | 2 |
Cello | 1 |
Crompton | 1 |
Havells | 1 |
Kenstar | 1 |
LG | 1 |
Orient | 1 |
Symphony | 2 |
Voltas | 1 |
Step 9: Creating a Macro to Filter Coolers
Macro: Coolers by Rating and Type
%macro filter_coolers(rating=, type=);
proc sql;
select * from coolers
where Rating >= &rating and Cooler_Type = "&type";
quit;
%mend;
%filter_coolers(rating=4.0, type=AirCooler)
Output:
Cooler_ID | Cooler_Type | Brand | Cooling_Capacity | Power_Consumption | Price | Rating | Warranty_Years | Stock_Status |
---|---|---|---|---|---|---|---|---|
C001 | AirCooler | Symphony | 30 | 180 | 6500 | 4.2 | 1 | Yes |
C006 | AirCooler | Kenstar | 35 | 210 | 6900 | 4 | 1 | No |
C011 | AirCooler | Crompton | 32 | 190 | 6600 | 4.1 | 1 | Yes |
Step 10: Creating Macro for Summary Reports
Macro: Summarize Any Numeric Variable
%macro summary(var=);
proc means data=coolers n mean min max std;
var &var;
title "Summary Statistics for &var";
run;
%mend;
%summary(var=Power_Consumption)
Output:
Summary Statistics for Power_Consumption |
Analysis Variable : Power_Consumption | ||||
---|---|---|---|---|
N | Mean | Minimum | Maximum | Std Dev |
15 | 293.3333333 | 100.0000000 | 550.0000000 | 160.6978828 |
%summary(var=Price)
Output:
Summary Statistics for Price |
Analysis Variable : Price | ||||
---|---|---|---|---|
N | Mean | Minimum | Maximum | Std Dev |
15 | 9346.67 | 4300.00 | 15000.00 | 4048.08 |
Step 11: Visualization with PROC SGPLOT
proc sgplot data=coolers;
vbar Cooler_Type / response=Price stat=mean;
title "Average Price by Cooler Type";
run;
Log:
proc sgplot data=coolers;
scatter x=Cooling_Capacity y=Price / group=Cooler_Type;
title "Cooling Capacity vs Price";
run;
Log:
Step 12: Creating Derived Variables
New Variable: Efficiency Score
data coolers_updated;
set coolers;
Efficiency = Cooling_Capacity / Power_Consumption;
run;
proc print data=coolers_updated;
title "Coolers with Efficiency Score";
run;
Output:
Coolers with Efficiency Score |
Obs | Cooler_ID | Cooler_Type | Brand | Cooling_Capacity | Power_Consumption | Price | Rating | Warranty_Years | Stock_Status | Efficiency |
---|---|---|---|---|---|---|---|---|---|---|
1 | C001 | AirCooler | Symphony | 30 | 180 | 6500 | 4.2 | 1 | Yes | 0.16667 |
2 | C002 | WaterCooler | BlueStar | 60 | 420 | 14500 | 4.7 | 2 | Yes | 0.14286 |
3 | C003 | TowerCooler | Orient | 40 | 200 | 7200 | 4.1 | 1 | Yes | 0.20000 |
4 | C004 | DesertCooler | Bajaj | 80 | 500 | 12500 | 4.5 | 2 | No | 0.16000 |
5 | C005 | MiniCooler | Cello | 20 | 120 | 4800 | 3.8 | 1 | Yes | 0.16667 |
6 | C006 | AirCooler | Kenstar | 35 | 210 | 6900 | 4.0 | 1 | No | 0.16667 |
7 | C007 | WaterCooler | Voltas | 70 | 450 | 15000 | 4.6 | 3 | Yes | 0.15556 |
8 | C008 | TowerCooler | Havells | 45 | 220 | 7800 | 4.0 | 1 | Yes | 0.20455 |
9 | C009 | DesertCooler | Symphony | 85 | 550 | 13200 | 4.3 | 2 | Yes | 0.15455 |
10 | C010 | MiniCooler | Bajaj | 15 | 100 | 4300 | 3.5 | 1 | No | 0.15000 |
11 | C011 | AirCooler | Crompton | 32 | 190 | 6600 | 4.1 | 1 | Yes | 0.16842 |
12 | C012 | WaterCooler | LG | 65 | 400 | 14800 | 4.8 | 2 | Yes | 0.16250 |
13 | C013 | TowerCooler | Kenstar | 50 | 230 | 8000 | 4.2 | 1 | Yes | 0.21739 |
14 | C014 | DesertCooler | Orient | 90 | 520 | 13500 | 4.4 | 2 | No | 0.17308 |
15 | C015 | MiniCooler | BlueStar | 18 | 110 | 4600 | 3.7 | 1 | Yes | 0.16364 |
Step 13: Ranking Coolers by Rating
proc rank data=coolers out=ranked_coolers descending ties=low;
var Rating;
ranks Rank_Rating;
run;
proc print data=ranked_coolers;
title "Coolers Ranked by Rating";
run;
Output:
Coolers Ranked by Rating |
Obs | Cooler_ID | Cooler_Type | Brand | Cooling_Capacity | Power_Consumption | Price | Rating | Warranty_Years | Stock_Status | Rank_Rating |
---|---|---|---|---|---|---|---|---|---|---|
1 | C001 | AirCooler | Symphony | 30 | 180 | 6500 | 4.2 | 1 | Yes | 7 |
2 | C002 | WaterCooler | BlueStar | 60 | 420 | 14500 | 4.7 | 2 | Yes | 2 |
3 | C003 | TowerCooler | Orient | 40 | 200 | 7200 | 4.1 | 1 | Yes | 9 |
4 | C004 | DesertCooler | Bajaj | 80 | 500 | 12500 | 4.5 | 2 | No | 4 |
5 | C005 | MiniCooler | Cello | 20 | 120 | 4800 | 3.8 | 1 | Yes | 13 |
6 | C006 | AirCooler | Kenstar | 35 | 210 | 6900 | 4.0 | 1 | No | 11 |
7 | C007 | WaterCooler | Voltas | 70 | 450 | 15000 | 4.6 | 3 | Yes | 3 |
8 | C008 | TowerCooler | Havells | 45 | 220 | 7800 | 4.0 | 1 | Yes | 11 |
9 | C009 | DesertCooler | Symphony | 85 | 550 | 13200 | 4.3 | 2 | Yes | 6 |
10 | C010 | MiniCooler | Bajaj | 15 | 100 | 4300 | 3.5 | 1 | No | 15 |
11 | C011 | AirCooler | Crompton | 32 | 190 | 6600 | 4.1 | 1 | Yes | 9 |
12 | C012 | WaterCooler | LG | 65 | 400 | 14800 | 4.8 | 2 | Yes | 1 |
13 | C013 | TowerCooler | Kenstar | 50 | 230 | 8000 | 4.2 | 1 | Yes | 7 |
14 | C014 | DesertCooler | Orient | 90 | 520 | 13500 | 4.4 | 2 | No | 5 |
15 | C015 | MiniCooler | BlueStar | 18 | 110 | 4600 | 3.7 | 1 | Yes | 14 |
- Get link
- X
- Other Apps
Comments
Post a Comment