197.ANALYZING DIFFERENT TYPES OF ALCOHOLS USING PROC PRINT | PROC CONTENTS | PROC MEANS | PROC FREQ | PROC SGPLOT | PROC SQL | PROC TABULATE | PROC SORT | PROC RANK | PROC EXPORT | PROC DATA IN SAS
- Get link
- X
- Other Apps
ANALYZING DIFFERENT TYPES OF ALCOHOLS USING PROC PRINT | PROC CONTENTS | PROC MEANS | PROC FREQ | PROC SGPLOT | PROC SQL | PROC TABULATE | PROC SORT | PROC RANK | PROC EXPORT | PROC DATA IN SAS
/*Creating a dataset of different types of alcohol*/
Step 1: Create the Dataset
data alcohol_data;
length Brand $20 Category $10 Country $15;
input AlcoholID Brand $ Category $ Alcohol_Content Price(in dollars) Country $ Rating;
datalines;
1 Kingfisher Beer 4.8 2.5 India 7.5
2 Budweiser Beer 5.0 3.0 USA 8.0
3 Heineken Beer 5.0 3.5 Netherlands 8.5
4 Chivas Whiskey 40.0 45.0 Scotland 9.2
5 JackDaniel Whiskey 40.0 50.0 USA 9.5
6 Absolut Vodka 40.0 35.0 Sweden 8.8
7 Smirnoff Vodka 37.5 30.0 UK 8.2
8 GreyGoose Vodka 40.0 60.0 France 9.0
9 Sula Wine 12.0 10.0 India 7.0
10 YellowTail Wine 13.0 12.0 Australia 7.8
11 Barefoot Wine 12.5 11.0 USA 7.9
12 Bacardi Rum 35.0 25.0 PuertoRico 8.5
13 OldMonk Rum 42.8 15.0 India 8.0
14 HavanaClub Rum 40.0 20.0 Cuba 8.6
15 JimBeam Whiskey 40.0 38.0 USA 9.0
16 Carlsberg Beer 5.0 3.0 Denmark 7.8
17 RedLabel Whiskey 40.0 33.0 Scotland 8.7
;
run;
proc print;run;
Output:
Obs | Brand | Category | Country | AlcoholID | Alcohol_Content | Price | Rating |
---|---|---|---|---|---|---|---|
1 | Kingfisher | Beer | India | 1 | 4.8 | 2.5 | 7.5 |
2 | Budweiser | Beer | USA | 2 | 5.0 | 3.0 | 8.0 |
3 | Heineken | Beer | Netherlands | 3 | 5.0 | 3.5 | 8.5 |
4 | Chivas | Whiskey | Scotland | 4 | 40.0 | 45.0 | 9.2 |
5 | JackDaniel | Whiskey | USA | 5 | 40.0 | 50.0 | 9.5 |
6 | Absolut | Vodka | Sweden | 6 | 40.0 | 35.0 | 8.8 |
7 | Smirnoff | Vodka | UK | 7 | 37.5 | 30.0 | 8.2 |
8 | GreyGoose | Vodka | France | 8 | 40.0 | 60.0 | 9.0 |
9 | Sula | Wine | India | 9 | 12.0 | 10.0 | 7.0 |
10 | YellowTail | Wine | Australia | 10 | 13.0 | 12.0 | 7.8 |
11 | Barefoot | Wine | USA | 11 | 12.5 | 11.0 | 7.9 |
12 | Bacardi | Rum | PuertoRico | 12 | 35.0 | 25.0 | 8.5 |
13 | OldMonk | Rum | India | 13 | 42.8 | 15.0 | 8.0 |
14 | HavanaClub | Rum | Cuba | 14 | 40.0 | 20.0 | 8.6 |
15 | JimBeam | Whiskey | USA | 15 | 40.0 | 38.0 | 9.0 |
16 | Carlsberg | Beer | Denmark | 16 | 5.0 | 3.0 | 7.8 |
17 | RedLabel | Whiskey | Scotland | 17 | 40.0 | 33.0 | 8.7 |
Step 2: Exploring the Dataset with PROC CONTENTS and PROC PRINT
proc contents data=alcohol_data;
title "Structure of Alcohol Dataset";
run;
Output:
Structure of Alcohol Dataset |
Data Set Name | WORK.ALCOHOL_DATA | Observations | 17 |
---|---|---|---|
Member Type | DATA | Variables | 7 |
Engine | V9 | Indexes | 0 |
Created | 14/09/2015 00:01:55 | Observation Length | 80 |
Last Modified | 14/09/2015 00:01:55 | Deleted Observations | 0 |
Protection | Compressed | NO | |
Data Set Type | Sorted | NO | |
Label | |||
Data Representation | WINDOWS_64 | ||
Encoding | wlatin1 Western (Windows) |
Engine/Host Dependent Information | |
---|---|
Data Set Page Size | 65536 |
Number of Data Set Pages | 1 |
First Data Page | 1 |
Max Obs per Page | 817 |
Obs in First Data Page | 17 |
Number of Data Set Repairs | 0 |
ExtendObsCounter | YES |
Filename | C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD13872_DESKTOP-QFAA4KV_\alcohol_data.sas7bdat |
Release Created | 9.0401M2 |
Host Created | X64_8HOME |
Alphabetic List of Variables and Attributes | |||
---|---|---|---|
# | Variable | Type | Len |
4 | AlcoholID | Num | 8 |
5 | Alcohol_Content | Num | 8 |
1 | Brand | Char | 20 |
2 | Category | Char | 10 |
3 | Country | Char | 15 |
6 | Price | Num | 8 |
7 | Rating | Num | 8 |
proc print data=alcohol_data;
title "Raw Alcohol Data";
run;
Output:
Raw Alcohol Data |
Obs | Brand | Category | Country | AlcoholID | Alcohol_Content | Price | Rating |
---|---|---|---|---|---|---|---|
1 | Kingfisher | Beer | India | 1 | 4.8 | 2.5 | 7.5 |
2 | Budweiser | Beer | USA | 2 | 5.0 | 3.0 | 8.0 |
3 | Heineken | Beer | Netherlands | 3 | 5.0 | 3.5 | 8.5 |
4 | Chivas | Whiskey | Scotland | 4 | 40.0 | 45.0 | 9.2 |
5 | JackDaniel | Whiskey | USA | 5 | 40.0 | 50.0 | 9.5 |
6 | Absolut | Vodka | Sweden | 6 | 40.0 | 35.0 | 8.8 |
7 | Smirnoff | Vodka | UK | 7 | 37.5 | 30.0 | 8.2 |
8 | GreyGoose | Vodka | France | 8 | 40.0 | 60.0 | 9.0 |
9 | Sula | Wine | India | 9 | 12.0 | 10.0 | 7.0 |
10 | YellowTail | Wine | Australia | 10 | 13.0 | 12.0 | 7.8 |
11 | Barefoot | Wine | USA | 11 | 12.5 | 11.0 | 7.9 |
12 | Bacardi | Rum | PuertoRico | 12 | 35.0 | 25.0 | 8.5 |
13 | OldMonk | Rum | India | 13 | 42.8 | 15.0 | 8.0 |
14 | HavanaClub | Rum | Cuba | 14 | 40.0 | 20.0 | 8.6 |
15 | JimBeam | Whiskey | USA | 15 | 40.0 | 38.0 | 9.0 |
16 | Carlsberg | Beer | Denmark | 16 | 5.0 | 3.0 | 7.8 |
17 | RedLabel | Whiskey | Scotland | 17 | 40.0 | 33.0 | 8.7 |
Step 3: Summarizing Data with PROC MEANS and PROC FREQ
proc means data=alcohol_data mean min max stddev;
var Alcohol_Content Price Rating;
title "Summary Statistics for Alcohol Content, Price, and Rating";
run;
Output:
Summary Statistics for Alcohol Content, Price, and Rating |
Variable | Mean | Minimum | Maximum | Std Dev | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
|
|
|
proc freq data=alcohol_data;
tables Category Country;
title "Frequency of Categories and Countries";
run;
Output:
Frequency of Categories and Countries |
Category | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
---|---|---|---|---|
Beer | 4 | 23.53 | 4 | 23.53 |
Rum | 3 | 17.65 | 7 | 41.18 |
Vodka | 3 | 17.65 | 10 | 58.82 |
Whiskey | 4 | 23.53 | 14 | 82.35 |
Wine | 3 | 17.65 | 17 | 100.00 |
Country | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
---|---|---|---|---|
Australia | 1 | 5.88 | 1 | 5.88 |
Cuba | 1 | 5.88 | 2 | 11.76 |
Denmark | 1 | 5.88 | 3 | 17.65 |
France | 1 | 5.88 | 4 | 23.53 |
India | 3 | 17.65 | 7 | 41.18 |
Netherlands | 1 | 5.88 | 8 | 47.06 |
PuertoRico | 1 | 5.88 | 9 | 52.94 |
Scotland | 2 | 11.76 | 11 | 64.71 |
Sweden | 1 | 5.88 | 12 | 70.59 |
UK | 1 | 5.88 | 13 | 76.47 |
USA | 4 | 23.53 | 17 | 100.00 |
Step 4: Visual Analysis with PROC SGPLOT
proc sgplot data=alcohol_data;
vbar Category / response=Price stat=mean;
title "Average Price by Alcohol Category";
run;
Log:
proc sgplot data=alcohol_data;
scatter x=Alcohol_Content y=Price / group=Category;
title "Price vs Alcohol Content by Category";
run;
Log:
Step 5: Using PROC SQL for Complex Queries
proc sql;
/* Top-rated alcohols */
select Brand, Category, Rating
from alcohol_data
where Rating > 8.5
order by Rating desc;
Output:
Brand | Category | Rating |
---|---|---|
JackDaniel | Whiskey | 9.5 |
Chivas | Whiskey | 9.2 |
JimBeam | Whiskey | 9 |
GreyGoose | Vodka | 9 |
Absolut | Vodka | 8.8 |
RedLabel | Whiskey | 8.7 |
HavanaClub | Rum | 8.6 |
/* Average price per country */
select Country, avg(Price) as Avg_Price
from alcohol_data
group by Country;
Output:
Country | Avg_Price |
---|---|
Australia | 12 |
Cuba | 20 |
Denmark | 3 |
France | 60 |
India | 9.166667 |
Netherlands | 3.5 |
PuertoRico | 25 |
Scotland | 39 |
Sweden | 35 |
UK | 30 |
USA | 25.5 |
/* Count of alcohols by category */
select Category, count(*) as Count
from alcohol_data
group by Category;
quit;
Output:
Category | Count |
---|---|
Beer | 4 |
Rum | 3 |
Vodka | 3 |
Whiskey | 4 |
Wine | 3 |
Step 6: Creating a Macro to Filter by Category
%macro filter_by_category(cat);
proc print data=alcohol_data;
where Category="&cat";
title "Alcohols in Category: &cat";
run;
%mend;
%filter_by_category(Beer)
Output:
Alcohols in Category: Beer |
Obs | Brand | Category | Country | AlcoholID | Alcohol_Content | Price | Rating |
---|---|---|---|---|---|---|---|
1 | Kingfisher | Beer | India | 1 | 4.8 | 2.5 | 7.5 |
2 | Budweiser | Beer | USA | 2 | 5.0 | 3.0 | 8.0 |
3 | Heineken | Beer | Netherlands | 3 | 5.0 | 3.5 | 8.5 |
16 | Carlsberg | Beer | Denmark | 16 | 5.0 | 3.0 | 7.8 |
%filter_by_category(Whiskey)
Output:
Alcohols in Category: Whiskey |
Obs | Brand | Category | Country | AlcoholID | Alcohol_Content | Price | Rating |
---|---|---|---|---|---|---|---|
4 | Chivas | Whiskey | Scotland | 4 | 40 | 45 | 9.2 |
5 | JackDaniel | Whiskey | USA | 5 | 40 | 50 | 9.5 |
15 | JimBeam | Whiskey | USA | 15 | 40 | 38 | 9.0 |
17 | RedLabel | Whiskey | Scotland | 17 | 40 | 33 | 8.7 |
Step 7: Macro for Summary Statistics by Country
%macro summary_by_country(country);
proc means data=alcohol_data mean std maxdec=2;
where Country = "&country";
var Price Alcohol_Content Rating;
title "Summary for Country: &country";
run;
%mend;
%summary_by_country(India)
Output:
Summary for Country: India |
Variable | Mean | Std Dev | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
|
|
|
%summary_by_country(USA)
Output:
Summary for Country: USA |
Variable | Mean | Std Dev | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
|
|
|
Step 8: Creating Derived Variables
data alcohol_extended;
set alcohol_data;
length PriceCategory $10.;
if Price < 10 then PriceCategory = "Low";
else if Price < 30 then PriceCategory = "Medium";
else PriceCategory = "High";
HighRated = (Rating >= 8.5);
run;
proc print data=alcohol_extended;
title "Extended Alcohol Dataset with Derived Variables";
run;
Output:
Extended Alcohol Dataset with Derived Variables |
Obs | Brand | Category | Country | AlcoholID | Alcohol_Content | Price | Rating | PriceCategory | HighRated |
---|---|---|---|---|---|---|---|---|---|
1 | Kingfisher | Beer | India | 1 | 4.8 | 2.5 | 7.5 | Low | 0 |
2 | Budweiser | Beer | USA | 2 | 5.0 | 3.0 | 8.0 | Low | 0 |
3 | Heineken | Beer | Netherlands | 3 | 5.0 | 3.5 | 8.5 | Low | 1 |
4 | Chivas | Whiskey | Scotland | 4 | 40.0 | 45.0 | 9.2 | High | 1 |
5 | JackDaniel | Whiskey | USA | 5 | 40.0 | 50.0 | 9.5 | High | 1 |
6 | Absolut | Vodka | Sweden | 6 | 40.0 | 35.0 | 8.8 | High | 1 |
7 | Smirnoff | Vodka | UK | 7 | 37.5 | 30.0 | 8.2 | High | 0 |
8 | GreyGoose | Vodka | France | 8 | 40.0 | 60.0 | 9.0 | High | 1 |
9 | Sula | Wine | India | 9 | 12.0 | 10.0 | 7.0 | Medium | 0 |
10 | YellowTail | Wine | Australia | 10 | 13.0 | 12.0 | 7.8 | Medium | 0 |
11 | Barefoot | Wine | USA | 11 | 12.5 | 11.0 | 7.9 | Medium | 0 |
12 | Bacardi | Rum | PuertoRico | 12 | 35.0 | 25.0 | 8.5 | Medium | 1 |
13 | OldMonk | Rum | India | 13 | 42.8 | 15.0 | 8.0 | Medium | 0 |
14 | HavanaClub | Rum | Cuba | 14 | 40.0 | 20.0 | 8.6 | Medium | 1 |
15 | JimBeam | Whiskey | USA | 15 | 40.0 | 38.0 | 9.0 | High | 1 |
16 | Carlsberg | Beer | Denmark | 16 | 5.0 | 3.0 | 7.8 | Low | 0 |
17 | RedLabel | Whiskey | Scotland | 17 | 40.0 | 33.0 | 8.7 | High | 1 |
Step 9: Grouped Analysis Using PROC TABULATE
proc tabulate data=alcohol_extended;
class Category PriceCategory;
var Rating;
table Category, PriceCategory*Rating*(mean std);
title "Average Rating by Category and Price Level";
run;
Output:
Average Rating by Category and Price Level |
PriceCategory | ||||||
---|---|---|---|---|---|---|
High | Low | Medium | ||||
Rating | Rating | Rating | ||||
Mean | Std | Mean | Std | Mean | Std | |
Category | . | . | 7.95 | 0.42 | . | . |
Beer | ||||||
Rum | . | . | . | . | 8.37 | 0.32 |
Vodka | 8.67 | 0.42 | . | . | . | . |
Whiskey | 9.10 | 0.34 | . | . | . | . |
Wine | . | . | . | . | 7.57 | 0.49 |
Step 10: PROC SORT and PROC RANK for Top Brands
proc sort data=alcohol_extended out=sorted_alcohol;
by descending Rating;
run;
proc rank data=sorted_alcohol out=ranked_alcohol ties=low;
var Rating;
ranks RatingRank;
run;
proc print data=ranked_alcohol;
title "Alcohol Brands Ranked by Rating";
run;
Output:
Alcohol Brands Ranked by Rating |
Obs | Brand | Category | Country | AlcoholID | Alcohol_Content | Price | Rating | PriceCategory | HighRated | RatingRank |
---|---|---|---|---|---|---|---|---|---|---|
1 | JackDaniel | Whiskey | USA | 5 | 40.0 | 50.0 | 9.5 | High | 1 | 17 |
2 | Chivas | Whiskey | Scotland | 4 | 40.0 | 45.0 | 9.2 | High | 1 | 16 |
3 | GreyGoose | Vodka | France | 8 | 40.0 | 60.0 | 9.0 | High | 1 | 14 |
4 | JimBeam | Whiskey | USA | 15 | 40.0 | 38.0 | 9.0 | High | 1 | 14 |
5 | Absolut | Vodka | Sweden | 6 | 40.0 | 35.0 | 8.8 | High | 1 | 13 |
6 | RedLabel | Whiskey | Scotland | 17 | 40.0 | 33.0 | 8.7 | High | 1 | 12 |
7 | HavanaClub | Rum | Cuba | 14 | 40.0 | 20.0 | 8.6 | Medium | 1 | 11 |
8 | Heineken | Beer | Netherlands | 3 | 5.0 | 3.5 | 8.5 | Low | 1 | 9 |
9 | Bacardi | Rum | PuertoRico | 12 | 35.0 | 25.0 | 8.5 | Medium | 1 | 9 |
10 | Smirnoff | Vodka | UK | 7 | 37.5 | 30.0 | 8.2 | High | 0 | 8 |
11 | Budweiser | Beer | USA | 2 | 5.0 | 3.0 | 8.0 | Low | 0 | 6 |
12 | OldMonk | Rum | India | 13 | 42.8 | 15.0 | 8.0 | Medium | 0 | 6 |
13 | Barefoot | Wine | USA | 11 | 12.5 | 11.0 | 7.9 | Medium | 0 | 5 |
14 | YellowTail | Wine | Australia | 10 | 13.0 | 12.0 | 7.8 | Medium | 0 | 3 |
15 | Carlsberg | Beer | Denmark | 16 | 5.0 | 3.0 | 7.8 | Low | 0 | 3 |
16 | Kingfisher | Beer | India | 1 | 4.8 | 2.5 | 7.5 | Low | 0 | 2 |
17 | Sula | Wine | India | 9 | 12.0 | 10.0 | 7.0 | Medium | 0 | 1 |
Step 11: Data Cleaning Example Using PROC SQL
proc sql;
update alcohol_extended
set Brand = "Jack Daniel's"
where Brand = "JackDaniel";
quit;
proc print data=alcohol_extended;
where Brand = "Jack Daniel's";
title "Corrected Brand Name";
run;
Output:
Corrected Brand Name |
Obs | Brand | Category | Country | AlcoholID | Alcohol_Content | Price | Rating | PriceCategory | HighRated |
---|---|---|---|---|---|---|---|---|---|
5 | Jack Daniel's | Whiskey | USA | 5 | 40 | 50 | 9.5 | High | 1 |
Step 12: Exporting the Dataset
proc export data=alcohol_extended
outfile="alcohol_extended.csv"
dbms=csv
replace;
run;
Log:
NOTE: The file 'alcohol_extended.csv' is:
Filename=C:\sas folder\SASFoundation\9.4\alcohol_extended.csv,
RECFM=V,LRECL=32767,File Size (bytes)=0,
Last Modified=01 June 2025 17:35:54,
Create Time=01 June 2025 17:35:54
NOTE: 18 records were written to the file 'alcohol_extended.csv'.
The minimum record length was 32.
The maximum record length was 85.
NOTE: There were 17 observations read from the data set WORK.ALCOHOL_EXTENDED.
NOTE: DATA statement used (Total process time):
real time 0.17 seconds
cpu time 0.03 seconds
17 records created in alcohol_extended.csv from ALCOHOL_EXTENDED.
NOTE: "alcohol_extended.csv" file was successfully created.
NOTE: PROCEDURE EXPORT used (Total process time):
real time 0.90 seconds
cpu time 0.14 seconds
- Get link
- X
- Other Apps
Comments
Post a Comment