193.EXPLORING A COMPREHENSIVE COOLERS DATASET USING PROC SQL | PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC REPORT | PROC FORMAT | PROC SGPLOT | PROC MACRO | PROC TRANSPOSE | PROC UNIVARIATE | PROC CONTENTS IN SAS

EXPLORING A COMPREHENSIVE COOLERS DATASET USING PROC SQL | PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC REPORT | PROC FORMAT | PROC SGPLOT | PROC MACRO | PROC TRANSPOSE | PROC UNIVARIATE | PROC CONTENTS IN SAS

 /*A dataset of coolers and analyze it using different SAS procedures*/

Step 1:Dataset consists of variables like

Cooler_ID: Unique identifier

Cooler_Type: Type of cooler

Brand: Cooler brand

Cooling_Capacity: In liters/hour

Power_Consumption: In Watts

Price: In INR

Rating: Customer rating (1 to 5 stars)

Warranty_Years: Years of warranty

Stock_Status: Availability in stock


Step 2: Creating the Dataset

data coolers;

  input Cooler_ID $ Cooler_Type: $15. Brand $ Cooling_Capacity Power_Consumption Price Rating Warranty_Years Stock_Status $;

  datalines;

C001 AirCooler Symphony 30 180 6500 4.2 1 Yes

C002 WaterCooler BlueStar 60 420 14500 4.7 2 Yes

C003 TowerCooler Orient 40 200 7200 4.1 1 Yes

C004 DesertCooler Bajaj 80 500 12500 4.5 2 No

C005 MiniCooler Cello 20 120 4800 3.8 1 Yes

C006 AirCooler Kenstar 35 210 6900 4.0 1 No

C007 WaterCooler Voltas 70 450 15000 4.6 3 Yes

C008 TowerCooler Havells 45 220 7800 4.0 1 Yes

C009 DesertCooler Symphony 85 550 13200 4.3 2 Yes

C010 MiniCooler Bajaj 15 100 4300 3.5 1 No

C011 AirCooler Crompton 32 190 6600 4.1 1 Yes

C012 WaterCooler LG 65 400 14800 4.8 2 Yes

C013 TowerCooler Kenstar 50 230 8000 4.2 1 Yes

C014 DesertCooler Orient 90 520 13500 4.4 2 No

C015 MiniCooler BlueStar 18 110 4600 3.7 1 Yes

;

run;

proc print;run;

Output:

Obs Cooler_ID Cooler_Type Brand Cooling_Capacity Power_Consumption Price Rating Warranty_Years Stock_Status
1 C001 AirCooler Symphony 30 180 6500 4.2 1 Yes
2 C002 WaterCooler BlueStar 60 420 14500 4.7 2 Yes
3 C003 TowerCooler Orient 40 200 7200 4.1 1 Yes
4 C004 DesertCooler Bajaj 80 500 12500 4.5 2 No
5 C005 MiniCooler Cello 20 120 4800 3.8 1 Yes
6 C006 AirCooler Kenstar 35 210 6900 4.0 1 No
7 C007 WaterCooler Voltas 70 450 15000 4.6 3 Yes
8 C008 TowerCooler Havells 45 220 7800 4.0 1 Yes
9 C009 DesertCooler Symphony 85 550 13200 4.3 2 Yes
10 C010 MiniCooler Bajaj 15 100 4300 3.5 1 No
11 C011 AirCooler Crompton 32 190 6600 4.1 1 Yes
12 C012 WaterCooler LG 65 400 14800 4.8 2 Yes
13 C013 TowerCooler Kenstar 50 230 8000 4.2 1 Yes
14 C014 DesertCooler Orient 90 520 13500 4.4 2 No
15 C015 MiniCooler BlueStar 18 110 4600 3.7 1 Yes


Step 3: Exploring the Dataset

PROC PRINT: Basic View

proc print data=coolers;

  title "Full Cooler Dataset";

run;

Output:

Full Cooler Dataset

Obs Cooler_ID Cooler_Type Brand Cooling_Capacity Power_Consumption Price Rating Warranty_Years Stock_Status
1 C001 AirCooler Symphony 30 180 6500 4.2 1 Yes
2 C002 WaterCooler BlueStar 60 420 14500 4.7 2 Yes
3 C003 TowerCooler Orient 40 200 7200 4.1 1 Yes
4 C004 DesertCooler Bajaj 80 500 12500 4.5 2 No
5 C005 MiniCooler Cello 20 120 4800 3.8 1 Yes
6 C006 AirCooler Kenstar 35 210 6900 4.0 1 No
7 C007 WaterCooler Voltas 70 450 15000 4.6 3 Yes
8 C008 TowerCooler Havells 45 220 7800 4.0 1 Yes
9 C009 DesertCooler Symphony 85 550 13200 4.3 2 Yes
10 C010 MiniCooler Bajaj 15 100 4300 3.5 1 No
11 C011 AirCooler Crompton 32 190 6600 4.1 1 Yes
12 C012 WaterCooler LG 65 400 14800 4.8 2 Yes
13 C013 TowerCooler Kenstar 50 230 8000 4.2 1 Yes
14 C014 DesertCooler Orient 90 520 13500 4.4 2 No
15 C015 MiniCooler BlueStar 18 110 4600 3.7 1 Yes

PROC CONTENTS: Metadata of Dataset

proc contents data=coolers;

  title "Cooler Dataset Structure";

run;

Output:

Cooler Dataset Structure

The CONTENTS Procedure

Data Set Name WORK.COOLERS Observations 15
Member Type DATA Variables 9
Engine V9 Indexes 0
Created 14/09/2015 00:34:33 Observation Length 72
Last Modified 14/09/2015 00:34:33 Deleted Observations 0
Protection   Compressed NO
Data Set Type   Sorted NO
Label      
Data Representation WINDOWS_64    
Encoding wlatin1 Western (Windows)    

Engine/Host Dependent Information
Data Set Page Size 65536
Number of Data Set Pages 1
First Data Page 1
Max Obs per Page 908
Obs in First Data Page 15
Number of Data Set Repairs 0
ExtendObsCounter YES
Filename C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD5308_DESKTOP-QFAA4KV_\coolers.sas7bdat
Release Created 9.0401M2
Host Created X64_8HOME

Alphabetic List of Variables and Attributes
# Variable Type Len
3 Brand Char 8
1 Cooler_ID Char 8
2 Cooler_Type Char 8
4 Cooling_Capacity Num 8
5 Power_Consumption Num 8
6 Price Num 8
7 Rating Num 8
9 Stock_Status Char 8
8 Warranty_Years Num 8

Step 4: Descriptive Statistics

PROC MEANS: Numeric Summary

proc means data=coolers n mean min max std;

  var Cooling_Capacity Power_Consumption Price Rating;

  title "Descriptive Statistics for Numeric Variables";

run;

Output:

Descriptive Statistics for Numeric Variables

The MEANS Procedure

Variable N Mean Minimum Maximum Std Dev
Cooling_Capacity
Power_Consumption
Price
Rating
15
15
15
15
49.0000000
293.3333333
9346.67
4.1933333
15.0000000
100.0000000
4300.00
3.5000000
90.0000000
550.0000000
15000.00
4.8000000
24.8682241
160.6978828
4048.08
0.3692979


PROC UNIVARIATE: Detailed Stats

proc univariate data=coolers;

  var Price;

  title "Distribution of Cooler Prices";

run;

Output:

Distribution of Cooler Prices

The UNIVARIATE Procedure
Variable: Price

Moments
N 15 Sum Weights 15
Mean 9346.66667 Sum Observations 140200
Std Deviation 4048.08009 Variance 16386952.4
Skewness 0.2932028 Kurtosis -1.7229701
Uncorrected SS 1539820000 Corrected SS 229417333
Coeff Variation 43.3104146 Std Error Mean 1045.20978

Basic Statistical Measures
Location Variability
Mean 9346.667 Std Deviation 4048
Median 7800.000 Variance 16386952
Mode . Range 10700
    Interquartile Range 7000

Tests for Location: Mu0=0
Test Statistic p Value
Student's t t 8.942383 Pr > |t| <.0001
Sign M 7.5 Pr >= |M| <.0001
Signed Rank S 60 Pr >= |S| <.0001

Quantiles (Definition 5)
Level Quantile
100% Max 15000
99% 15000
95% 15000
90% 14800
75% Q3 13500
50% Median 7800
25% Q1 6500
10% 4600
5% 4300
1% 4300
0% Min 4300

Extreme Observations
Lowest Highest
Value Obs Value Obs
4300 10 13200 9
4600 15 13500 14
4800 5 14500 2
6500 1 14800 12
6600 11 15000 7

Step 5: Frequency Analysis

PROC FREQ: Cooler Types and Brands

proc freq data=coolers;

  tables Cooler_Type Brand Stock_Status;

  title "Frequency Tables for Categorical Variables";

run;

Output:

Frequency Tables for Categorical Variables

The FREQ Procedure

Cooler_Type Frequency Percent Cumulative
Frequency
Cumulative
Percent
AirCooler 3 20.00 3 20.00
DesertCooler 3 20.00 6 40.00
MiniCooler 3 20.00 9 60.00
TowerCooler 3 20.00 12 80.00
WaterCooler 3 20.00 15 100.00

Brand Frequency Percent Cumulative
Frequency
Cumulative
Percent
Bajaj 2 13.33 2 13.33
BlueStar 2 13.33 4 26.67
Cello 1 6.67 5 33.33
Crompton 1 6.67 6 40.00
Havells 1 6.67 7 46.67
Kenstar 2 13.33 9 60.00
LG 1 6.67 10 66.67
Orient 2 13.33 12 80.00
Symphony 2 13.33 14 93.33
Voltas 1 6.67 15 100.00

Stock_Status Frequency Percent Cumulative
Frequency
Cumulative
Percent
No 4 26.67 4 26.67
Yes 11 73.33 15 100.00


Step 6: Custom Reports

PROC REPORT: Summarize by Cooler Type

proc report data=coolers nowd;

  column Cooler_Type Cooling_Capacity Price;

  define Cooler_Type / group;

  define Cooling_Capacity / analysis mean;

  define Price / analysis mean;

  title "Average Cooling Capacity and Price by Cooler Type";

run;

Output:

Average Cooling Capacity and Price by Cooler Type

Cooler_Type Cooling_Capacity Price
AirCooler 32.333333 6666.6667
DesertCooler 85 13066.667
MiniCooler 17.666667 4566.6667
TowerCooler 45 7666.6667
WaterCooler 65 14766.667


Step 7: Sorting and Filtering

PROC SORT: By Price Descending

proc sort data=coolers out=sorted_price;

  by descending Price;

run;


proc print data=sorted_price;

  title "Coolers Sorted by Price (Descending)";

run;

Output:

Coolers Sorted by Price (Descending)

Obs Cooler_ID Cooler_Type Brand Cooling_Capacity Power_Consumption Price Rating Warranty_Years Stock_Status
1 C007 WaterCooler Voltas 70 450 15000 4.6 3 Yes
2 C012 WaterCooler LG 65 400 14800 4.8 2 Yes
3 C002 WaterCooler BlueStar 60 420 14500 4.7 2 Yes
4 C014 DesertCooler Orient 90 520 13500 4.4 2 No
5 C009 DesertCooler Symphony 85 550 13200 4.3 2 Yes
6 C004 DesertCooler Bajaj 80 500 12500 4.5 2 No
7 C013 TowerCooler Kenstar 50 230 8000 4.2 1 Yes
8 C008 TowerCooler Havells 45 220 7800 4.0 1 Yes
9 C003 TowerCooler Orient 40 200 7200 4.1 1 Yes
10 C006 AirCooler Kenstar 35 210 6900 4.0 1 No
11 C011 AirCooler Crompton 32 190 6600 4.1 1 Yes
12 C001 AirCooler Symphony 30 180 6500 4.2 1 Yes
13 C005 MiniCooler Cello 20 120 4800 3.8 1 Yes
14 C015 MiniCooler BlueStar 18 110 4600 3.7 1 Yes
15 C010 MiniCooler Bajaj 15 100 4300 3.5 1 No


PROC SQL: Top 5 Expensive Coolers

proc sql outobs=5;

  create table top_expensive as

  select * from coolers

  order by Price desc;

quit;


proc print data=top_expensive;

  title "Top 5 Most Expensive Coolers";

run;

Output:

Top 5 Most Expensive Coolers

Obs Cooler_ID Cooler_Type Brand Cooling_Capacity Power_Consumption Price Rating Warranty_Years Stock_Status
1 C007 WaterCooler Voltas 70 450 15000 4.6 3 Yes
2 C012 WaterCooler LG 65 400 14800 4.8 2 Yes
3 C002 WaterCooler BlueStar 60 420 14500 4.7 2 Yes
4 C014 DesertCooler Orient 90 520 13500 4.4 2 No
5 C009 DesertCooler Symphony 85 550 13200 4.3 2 Yes


Step 8: Advanced SQL Analysis

SQL: Average Price and Power by Cooler Type

proc sql;

  select Cooler_Type,

         avg(Price) as Avg_Price,

         avg(Power_Consumption) as Avg_Power

  from coolers

  group by Cooler_Type;

quit;

Output:

Cooler_Type Avg_Price Avg_Power
AirCooler 6666.667 193.3333
DesertCooler 13066.67 523.3333
MiniCooler 4566.667 110
TowerCooler 7666.667 216.6667
WaterCooler 14766.67 423.3333


SQL: Count of In-Stock Coolers by Brand

proc sql;

  select Brand, count(*) as InStock_Count

  from coolers

  where Stock_Status = 'Yes'

  group by Brand;

quit;

Output:

Brand InStock_Count
BlueStar 2
Cello 1
Crompton 1
Havells 1
Kenstar 1
LG 1
Orient 1
Symphony 2
Voltas 1


Step 9: Creating a Macro to Filter Coolers

Macro: Coolers by Rating and Type

%macro filter_coolers(rating=, type=);

  proc sql;

    select * from coolers

    where Rating >= &rating and Cooler_Type = "&type";

  quit;

%mend;


%filter_coolers(rating=4.0, type=AirCooler)

Output:

Cooler_ID Cooler_Type Brand Cooling_Capacity Power_Consumption Price Rating Warranty_Years Stock_Status
C001 AirCooler Symphony 30 180 6500 4.2 1 Yes
C006 AirCooler Kenstar 35 210 6900 4 1 No
C011 AirCooler Crompton 32 190 6600 4.1 1 Yes

Step 10: Creating Macro for Summary Reports

Macro: Summarize Any Numeric Variable

%macro summary(var=);

  proc means data=coolers n mean min max std;

    var &var;

    title "Summary Statistics for &var";

  run;

%mend;


%summary(var=Power_Consumption)

Output:

Summary Statistics for Power_Consumption

The MEANS Procedure

Analysis Variable : Power_Consumption
N Mean Minimum Maximum Std Dev
15 293.3333333 100.0000000 550.0000000 160.6978828

%summary(var=Price)

Output:

Summary Statistics for Price

The MEANS Procedure

Analysis Variable : Price
N Mean Minimum Maximum Std Dev
15 9346.67 4300.00 15000.00 4048.08

Step 11: Visualization with PROC SGPLOT

proc sgplot data=coolers;

  vbar Cooler_Type / response=Price stat=mean;

  title "Average Price by Cooler Type";

run;

Log:

NOTE: PROCEDURE SGPLOT used (Total process time):
      real time           3.65 seconds
      cpu time            0.62 seconds

NOTE: Listing image output written to SGPlot1.png.
NOTE: There were 15 observations read from the data set WORK.COOLERS.

proc sgplot data=coolers;

  scatter x=Cooling_Capacity y=Price / group=Cooler_Type;

  title "Cooling Capacity vs Price";

run;

Log:

NOTE: PROCEDURE SGPLOT used (Total process time):
      real time           0.78 seconds
      cpu time            0.03 seconds

NOTE: Listing image output written to SGPlot3.png.
NOTE: There were 15 observations read from the data set WORK.COOLERS.

Step 12: Creating Derived Variables

New Variable: Efficiency Score

data coolers_updated;

  set coolers;

  Efficiency = Cooling_Capacity / Power_Consumption;

run;

proc print data=coolers_updated;

  title "Coolers with Efficiency Score";

run;

Output:

Coolers with Efficiency Score

Obs Cooler_ID Cooler_Type Brand Cooling_Capacity Power_Consumption Price Rating Warranty_Years Stock_Status Efficiency
1 C001 AirCooler Symphony 30 180 6500 4.2 1 Yes 0.16667
2 C002 WaterCooler BlueStar 60 420 14500 4.7 2 Yes 0.14286
3 C003 TowerCooler Orient 40 200 7200 4.1 1 Yes 0.20000
4 C004 DesertCooler Bajaj 80 500 12500 4.5 2 No 0.16000
5 C005 MiniCooler Cello 20 120 4800 3.8 1 Yes 0.16667
6 C006 AirCooler Kenstar 35 210 6900 4.0 1 No 0.16667
7 C007 WaterCooler Voltas 70 450 15000 4.6 3 Yes 0.15556
8 C008 TowerCooler Havells 45 220 7800 4.0 1 Yes 0.20455
9 C009 DesertCooler Symphony 85 550 13200 4.3 2 Yes 0.15455
10 C010 MiniCooler Bajaj 15 100 4300 3.5 1 No 0.15000
11 C011 AirCooler Crompton 32 190 6600 4.1 1 Yes 0.16842
12 C012 WaterCooler LG 65 400 14800 4.8 2 Yes 0.16250
13 C013 TowerCooler Kenstar 50 230 8000 4.2 1 Yes 0.21739
14 C014 DesertCooler Orient 90 520 13500 4.4 2 No 0.17308
15 C015 MiniCooler BlueStar 18 110 4600 3.7 1 Yes 0.16364


Step 13: Ranking Coolers by Rating

proc rank data=coolers out=ranked_coolers descending ties=low;

  var Rating;

  ranks Rank_Rating;

run;

proc print data=ranked_coolers;

  title "Coolers Ranked by Rating";

run;

Output:

Coolers Ranked by Rating

Obs Cooler_ID Cooler_Type Brand Cooling_Capacity Power_Consumption Price Rating Warranty_Years Stock_Status Rank_Rating
1 C001 AirCooler Symphony 30 180 6500 4.2 1 Yes 7
2 C002 WaterCooler BlueStar 60 420 14500 4.7 2 Yes 2
3 C003 TowerCooler Orient 40 200 7200 4.1 1 Yes 9
4 C004 DesertCooler Bajaj 80 500 12500 4.5 2 No 4
5 C005 MiniCooler Cello 20 120 4800 3.8 1 Yes 13
6 C006 AirCooler Kenstar 35 210 6900 4.0 1 No 11
7 C007 WaterCooler Voltas 70 450 15000 4.6 3 Yes 3
8 C008 TowerCooler Havells 45 220 7800 4.0 1 Yes 11
9 C009 DesertCooler Symphony 85 550 13200 4.3 2 Yes 6
10 C010 MiniCooler Bajaj 15 100 4300 3.5 1 No 15
11 C011 AirCooler Crompton 32 190 6600 4.1 1 Yes 9
12 C012 WaterCooler LG 65 400 14800 4.8 2 Yes 1
13 C013 TowerCooler Kenstar 50 230 8000 4.2 1 Yes 7
14 C014 DesertCooler Orient 90 520 13500 4.4 2 No 5
15 C015 MiniCooler BlueStar 18 110 4600 3.7 1 Yes 14


PRACTICE AND COMMENT YOUR OUTPUT: 

--->PLEASE FOLLOW OUR BLOG FOR MORE UPDATES.

PLEASE FOLLOW OUR TELEGRAM CHANNEL CLICK HERE

PLEASE FOLLOW OUR FACEBOOK PAGE  CLICK HERE

PLEASE FOLLOW OUR INSTAGRAM PAGE CLICK HERE




Comments