MUSICAL INSTRUMENTS DATA ANALYSIS USING SAS DATA STEP | PROC SQL | MEANS | REG | SGPLOT | MACROS | DATE FUNCTIONS | APPEND | MERGE | SET | TRANSPOSE
options nocenter;
1.CREATING INSTRUMENTS RAW DATASET
data instruments_raw;
length Instrument_Name $20 Origin_Country $20;
format Launch_Date date9.;
input Instrument_Name $ Origin_Country $ Sound_Level Complexity Popularity
Launch_Date :date9.;
datalines;
Guitar USA 85 6 90 01JAN2010
Violin Italy 78 8 88 15FEB2008
Flute India 65 5 70 10MAR2012
Drums USA 95 7 85 20APR2015
Piano Germany 80 9 92 05MAY2005
Sitar India 75 8 76 12JUN2009
Trumpet France 88 6 74 30JUL2011
Saxophone USA 90 7 80 18AUG2013
Tabla India 70 7 78 25SEP2014
Harp Ireland 68 9 60 02OCT2007
Cello Germany 72 8 66 19NOV2006
Ukulele Hawaii 60 4 82 10DEC2016
Harmonium India 73 6 71 05JAN2018
;
run;
proc print data=instruments_raw;
run;
OUTPUT:
| Obs | Instrument_Name | Origin_Country | Launch_Date | Sound_Level | Complexity | Popularity |
|---|---|---|---|---|---|---|
| 1 | Guitar | USA | 01JAN2010 | 85 | 6 | 90 |
| 2 | Violin | Italy | 15FEB2008 | 78 | 8 | 88 |
| 3 | Flute | India | 10MAR2012 | 65 | 5 | 70 |
| 4 | Drums | USA | 20APR2015 | 95 | 7 | 85 |
| 5 | Piano | Germany | 05MAY2005 | 80 | 9 | 92 |
| 6 | Sitar | India | 12JUN2009 | 75 | 8 | 76 |
| 7 | Trumpet | France | 30JUL2011 | 88 | 6 | 74 |
| 8 | Saxophone | USA | 18AUG2013 | 90 | 7 | 80 |
| 9 | Tabla | India | 25SEP2014 | 70 | 7 | 78 |
| 10 | Harp | Ireland | 02OCT2007 | 68 | 9 | 60 |
| 11 | Cello | Germany | 19NOV2006 | 72 | 8 | 66 |
| 12 | Ukulele | Hawaii | 10DEC2016 | 60 | 4 | 82 |
| 13 | Harmonium | India | 05JAN2018 | 73 | 6 | 71 |
2.DATE DERIVATIONS – MDY, INTNX, INTCK
data instruments_dates;
set instruments_raw;
Year_Introduced = year(Launch_Date);
Review_Date = intnx('year', Launch_Date, 1, 'same');
Years_Since_Launch = intck('year', Launch_Date, today());
Custom_Date = mdy(12,31,Year_Introduced);
format Review_Date Custom_Date date9.;
run;
proc print data=instruments_dates;
run;
OUTPUT:
| Obs | Instrument_Name | Origin_Country | Launch_Date | Sound_Level | Complexity | Popularity | Year_Introduced | Review_Date | Years_Since_Launch | Custom_Date |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Guitar | USA | 01JAN2010 | 85 | 6 | 90 | 2010 | 01JAN2011 | 16 | 31DEC2010 |
| 2 | Violin | Italy | 15FEB2008 | 78 | 8 | 88 | 2008 | 15FEB2009 | 18 | 31DEC2008 |
| 3 | Flute | India | 10MAR2012 | 65 | 5 | 70 | 2012 | 10MAR2013 | 14 | 31DEC2012 |
| 4 | Drums | USA | 20APR2015 | 95 | 7 | 85 | 2015 | 20APR2016 | 11 | 31DEC2015 |
| 5 | Piano | Germany | 05MAY2005 | 80 | 9 | 92 | 2005 | 05MAY2006 | 21 | 31DEC2005 |
| 6 | Sitar | India | 12JUN2009 | 75 | 8 | 76 | 2009 | 12JUN2010 | 17 | 31DEC2009 |
| 7 | Trumpet | France | 30JUL2011 | 88 | 6 | 74 | 2011 | 30JUL2012 | 15 | 31DEC2011 |
| 8 | Saxophone | USA | 18AUG2013 | 90 | 7 | 80 | 2013 | 18AUG2014 | 13 | 31DEC2013 |
| 9 | Tabla | India | 25SEP2014 | 70 | 7 | 78 | 2014 | 25SEP2015 | 12 | 31DEC2014 |
| 10 | Harp | Ireland | 02OCT2007 | 68 | 9 | 60 | 2007 | 02OCT2008 | 19 | 31DEC2007 |
| 11 | Cello | Germany | 19NOV2006 | 72 | 8 | 66 | 2006 | 19NOV2007 | 20 | 31DEC2006 |
| 12 | Ukulele | Hawaii | 10DEC2016 | 60 | 4 | 82 | 2016 | 10DEC2017 | 10 | 31DEC2016 |
| 13 | Harmonium | India | 05JAN2018 | 73 | 6 | 71 | 2018 | 05JAN2019 | 8 | 31DEC2018 |
3.PROC SQL – DATA EXTRACTION & DERIVATION
proc sql;
create table instruments_sql as
select Instrument_Name,Origin_Country,Sound_Level,Complexity,Popularity,
(Sound_Level * Complexity) as Impact_Score,Launch_Date
from instruments_dates
where Popularity > 70;
quit;
proc print data=instruments_sql;
run;
OUTPUT:
| Obs | Instrument_Name | Origin_Country | Sound_Level | Complexity | Popularity | Impact_Score | Launch_Date |
|---|---|---|---|---|---|---|---|
| 1 | Guitar | USA | 85 | 6 | 90 | 510 | 01JAN2010 |
| 2 | Violin | Italy | 78 | 8 | 88 | 624 | 15FEB2008 |
| 3 | Drums | USA | 95 | 7 | 85 | 665 | 20APR2015 |
| 4 | Piano | Germany | 80 | 9 | 92 | 720 | 05MAY2005 |
| 5 | Sitar | India | 75 | 8 | 76 | 600 | 12JUN2009 |
| 6 | Trumpet | France | 88 | 6 | 74 | 528 | 30JUL2011 |
| 7 | Saxophone | USA | 90 | 7 | 80 | 630 | 18AUG2013 |
| 8 | Tabla | India | 70 | 7 | 78 | 490 | 25SEP2014 |
| 9 | Ukulele | Hawaii | 60 | 4 | 82 | 240 | 10DEC2016 |
| 10 | Harmonium | India | 73 | 6 | 71 | 438 | 05JAN2018 |
4.MACRO – INSTRUMENT POPULARITY RATING
%macro popularity_rating(input_ds, output_ds);
data &output_ds;
set &input_ds;
length Popularity_Rating $12;
if Popularity >= 85 then Popularity_Rating = "HIGH";
else if Popularity >= 70 then Popularity_Rating = "MEDIUM";
else Popularity_Rating = "LOW";
run;
proc print data=&output_ds;
run;
%mend;
%popularity_rating(instruments_sql, instruments_rated);
OUTPUT:
| Obs | Instrument_Name | Origin_Country | Sound_Level | Complexity | Popularity | Impact_Score | Launch_Date | Popularity_Rating |
|---|---|---|---|---|---|---|---|---|
| 1 | Guitar | USA | 85 | 6 | 90 | 510 | 01JAN2010 | HIGH |
| 2 | Violin | Italy | 78 | 8 | 88 | 624 | 15FEB2008 | HIGH |
| 3 | Drums | USA | 95 | 7 | 85 | 665 | 20APR2015 | HIGH |
| 4 | Piano | Germany | 80 | 9 | 92 | 720 | 05MAY2005 | HIGH |
| 5 | Sitar | India | 75 | 8 | 76 | 600 | 12JUN2009 | MEDIUM |
| 6 | Trumpet | France | 88 | 6 | 74 | 528 | 30JUL2011 | MEDIUM |
| 7 | Saxophone | USA | 90 | 7 | 80 | 630 | 18AUG2013 | MEDIUM |
| 8 | Tabla | India | 70 | 7 | 78 | 490 | 25SEP2014 | MEDIUM |
| 9 | Ukulele | Hawaii | 60 | 4 | 82 | 240 | 10DEC2016 | MEDIUM |
| 10 | Harmonium | India | 73 | 6 | 71 | 438 | 05JAN2018 | MEDIUM |
5.PROC MEANS – DESCRIPTIVE STATISTICS
proc means data=instruments_rated n mean min max;
var Sound_Level Complexity Popularity Impact_Score;
run;
OUTPUT:
The MEANS Procedure
| Variable | N | Mean | Minimum | Maximum |
|---|---|---|---|---|
Sound_Level Complexity Popularity Impact_Score | 10 10 10 10 | 79.4000000 6.8000000 81.6000000 544.5000000 | 60.0000000 4.0000000 71.0000000 240.0000000 | 95.0000000 9.0000000 92.0000000 720.0000000 |
6.PROC REG – REGRESSION ANALYSIS
proc reg data=instruments_rated;
model Popularity = Sound_Level Complexity;
run;
quit;
OUTPUT:
The REG Procedure
Model: MODEL1
Dependent Variable: Popularity
| Number of Observations Read | 10 |
|---|---|
| Number of Observations Used | 10 |
| Analysis of Variance | |||||
|---|---|---|---|---|---|
| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
| Model | 2 | 55.30204 | 27.65102 | 0.49 | 0.6308 |
| Error | 7 | 393.09796 | 56.15685 | ||
| Corrected Total | 9 | 448.40000 | |||
| Root MSE | 7.49379 | R-Square | 0.1233 |
|---|---|---|---|
| Dependent Mean | 81.60000 | Adj R-Sq | -0.1271 |
| Coeff Var | 9.18356 |
| Parameter Estimates | |||||
|---|---|---|---|---|---|
| Variable | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| |
| Intercept | 1 | 66.02663 | 19.97850 | 3.30 | 0.0130 |
| Sound_Level | 1 | 0.06329 | 0.25418 | 0.25 | 0.8105 |
| Complexity | 1 | 1.55122 | 1.90479 | 0.81 | 0.4422 |
The REG Procedure
Model: MODEL1
Dependent Variable: Popularity
7.PROC SGPLOT – VISUALIZATION
proc sgplot data=instruments_rated;
scatter x=Sound_Level y=Popularity;
reg x=Sound_Level y=Popularity;
title "Popularity vs Sound Level of Musical Instruments";
run;
OUTPUT:
8.SET STATEMENT – COMBINING DATASETS
data instruments_combined;
set instruments_raw
instruments_rated;
run;
proc print data=instruments_combined;
run;
OUTPUT:
| Obs | Instrument_Name | Origin_Country | Launch_Date | Sound_Level | Complexity | Popularity | Impact_Score | Popularity_Rating |
|---|---|---|---|---|---|---|---|---|
| 1 | Guitar | USA | 01JAN2010 | 85 | 6 | 90 | . | |
| 2 | Violin | Italy | 15FEB2008 | 78 | 8 | 88 | . | |
| 3 | Flute | India | 10MAR2012 | 65 | 5 | 70 | . | |
| 4 | Drums | USA | 20APR2015 | 95 | 7 | 85 | . | |
| 5 | Piano | Germany | 05MAY2005 | 80 | 9 | 92 | . | |
| 6 | Sitar | India | 12JUN2009 | 75 | 8 | 76 | . | |
| 7 | Trumpet | France | 30JUL2011 | 88 | 6 | 74 | . | |
| 8 | Saxophone | USA | 18AUG2013 | 90 | 7 | 80 | . | |
| 9 | Tabla | India | 25SEP2014 | 70 | 7 | 78 | . | |
| 10 | Harp | Ireland | 02OCT2007 | 68 | 9 | 60 | . | |
| 11 | Cello | Germany | 19NOV2006 | 72 | 8 | 66 | . | |
| 12 | Ukulele | Hawaii | 10DEC2016 | 60 | 4 | 82 | . | |
| 13 | Harmonium | India | 05JAN2018 | 73 | 6 | 71 | . | |
| 14 | Guitar | USA | 01JAN2010 | 85 | 6 | 90 | 510 | HIGH |
| 15 | Violin | Italy | 15FEB2008 | 78 | 8 | 88 | 624 | HIGH |
| 16 | Drums | USA | 20APR2015 | 95 | 7 | 85 | 665 | HIGH |
| 17 | Piano | Germany | 05MAY2005 | 80 | 9 | 92 | 720 | HIGH |
| 18 | Sitar | India | 12JUN2009 | 75 | 8 | 76 | 600 | MEDIUM |
| 19 | Trumpet | France | 30JUL2011 | 88 | 6 | 74 | 528 | MEDIUM |
| 20 | Saxophone | USA | 18AUG2013 | 90 | 7 | 80 | 630 | MEDIUM |
| 21 | Tabla | India | 25SEP2014 | 70 | 7 | 78 | 490 | MEDIUM |
| 22 | Ukulele | Hawaii | 10DEC2016 | 60 | 4 | 82 | 240 | MEDIUM |
| 23 | Harmonium | India | 05JAN2018 | 73 | 6 | 71 | 438 | MEDIUM |
9.APPEND – ADDING NEW OBSERVATIONS
data new_instruments;
length Instrument_Name $20 Origin_Country $20;
format Launch_Date date9.;
Instrument_Name = "Banjo";
Origin_Country = "USA";
Sound_Level = 82;
Complexity = 6;
Popularity = 65;
Launch_Date = '15FEB2020'd;
run;
proc print data=new_instruments;
run;
OUTPUT:
| Obs | Instrument_Name | Origin_Country | Launch_Date | Sound_Level | Complexity | Popularity |
|---|---|---|---|---|---|---|
| 1 | Banjo | USA | 15FEB2020 | 82 | 6 | 65 |
proc append base=instruments_raw
data=new_instruments force;
run;
proc print data=instruments_raw;
run;
OUTPUT:
| Obs | Instrument_Name | Origin_Country | Launch_Date | Sound_Level | Complexity | Popularity |
|---|---|---|---|---|---|---|
| 1 | Guitar | USA | 01JAN2010 | 85 | 6 | 90 |
| 2 | Violin | Italy | 15FEB2008 | 78 | 8 | 88 |
| 3 | Flute | India | 10MAR2012 | 65 | 5 | 70 |
| 4 | Drums | USA | 20APR2015 | 95 | 7 | 85 |
| 5 | Piano | Germany | 05MAY2005 | 80 | 9 | 92 |
| 6 | Sitar | India | 12JUN2009 | 75 | 8 | 76 |
| 7 | Trumpet | France | 30JUL2011 | 88 | 6 | 74 |
| 8 | Saxophone | USA | 18AUG2013 | 90 | 7 | 80 |
| 9 | Tabla | India | 25SEP2014 | 70 | 7 | 78 |
| 10 | Harp | Ireland | 02OCT2007 | 68 | 9 | 60 |
| 11 | Cello | Germany | 19NOV2006 | 72 | 8 | 66 |
| 12 | Ukulele | Hawaii | 10DEC2016 | 60 | 4 | 82 |
| 13 | Harmonium | India | 05JAN2018 | 73 | 6 | 71 |
| 14 | Banjo | USA | 15FEB2020 | 82 | 6 | 65 |
10.MERGE – JOINING BY KEY VARIABLE
proc sort data=instruments_raw; by Instrument_Name; run;
proc print data=instruments_raw;
run;
OUTPUT:
| Obs | Instrument_Name | Origin_Country | Launch_Date | Sound_Level | Complexity | Popularity |
|---|---|---|---|---|---|---|
| 1 | Banjo | USA | 15FEB2020 | 82 | 6 | 65 |
| 2 | Cello | Germany | 19NOV2006 | 72 | 8 | 66 |
| 3 | Drums | USA | 20APR2015 | 95 | 7 | 85 |
| 4 | Flute | India | 10MAR2012 | 65 | 5 | 70 |
| 5 | Guitar | USA | 01JAN2010 | 85 | 6 | 90 |
| 6 | Harmonium | India | 05JAN2018 | 73 | 6 | 71 |
| 7 | Harp | Ireland | 02OCT2007 | 68 | 9 | 60 |
| 8 | Piano | Germany | 05MAY2005 | 80 | 9 | 92 |
| 9 | Saxophone | USA | 18AUG2013 | 90 | 7 | 80 |
| 10 | Sitar | India | 12JUN2009 | 75 | 8 | 76 |
| 11 | Tabla | India | 25SEP2014 | 70 | 7 | 78 |
| 12 | Trumpet | France | 30JUL2011 | 88 | 6 | 74 |
| 13 | Ukulele | Hawaii | 10DEC2016 | 60 | 4 | 82 |
| 14 | Violin | Italy | 15FEB2008 | 78 | 8 | 88 |
proc sort data=instruments_rated; by Instrument_Name; run;
proc print data=instruments_rated;
run;
OUTPUT:
| Obs | Instrument_Name | Origin_Country | Sound_Level | Complexity | Popularity | Impact_Score | Launch_Date | Popularity_Rating |
|---|---|---|---|---|---|---|---|---|
| 1 | Drums | USA | 95 | 7 | 85 | 665 | 20APR2015 | HIGH |
| 2 | Guitar | USA | 85 | 6 | 90 | 510 | 01JAN2010 | HIGH |
| 3 | Harmonium | India | 73 | 6 | 71 | 438 | 05JAN2018 | MEDIUM |
| 4 | Piano | Germany | 80 | 9 | 92 | 720 | 05MAY2005 | HIGH |
| 5 | Saxophone | USA | 90 | 7 | 80 | 630 | 18AUG2013 | MEDIUM |
| 6 | Sitar | India | 75 | 8 | 76 | 600 | 12JUN2009 | MEDIUM |
| 7 | Tabla | India | 70 | 7 | 78 | 490 | 25SEP2014 | MEDIUM |
| 8 | Trumpet | France | 88 | 6 | 74 | 528 | 30JUL2011 | MEDIUM |
| 9 | Ukulele | Hawaii | 60 | 4 | 82 | 240 | 10DEC2016 | MEDIUM |
| 10 | Violin | Italy | 78 | 8 | 88 | 624 | 15FEB2008 | HIGH |
data instruments_merged;
merge instruments_raw(in=a)
instruments_rated(in=b);
by Instrument_Name;
if a;
run;
proc print data=instruments_merged;
run;
OUTPUT:
| Obs | Instrument_Name | Origin_Country | Launch_Date | Sound_Level | Complexity | Popularity | Impact_Score | Popularity_Rating |
|---|---|---|---|---|---|---|---|---|
| 1 | Banjo | USA | 15FEB2020 | 82 | 6 | 65 | . | |
| 2 | Cello | Germany | 19NOV2006 | 72 | 8 | 66 | . | |
| 3 | Drums | USA | 20APR2015 | 95 | 7 | 85 | 665 | HIGH |
| 4 | Flute | India | 10MAR2012 | 65 | 5 | 70 | . | |
| 5 | Guitar | USA | 01JAN2010 | 85 | 6 | 90 | 510 | HIGH |
| 6 | Harmonium | India | 05JAN2018 | 73 | 6 | 71 | 438 | MEDIUM |
| 7 | Harp | Ireland | 02OCT2007 | 68 | 9 | 60 | . | |
| 8 | Piano | Germany | 05MAY2005 | 80 | 9 | 92 | 720 | HIGH |
| 9 | Saxophone | USA | 18AUG2013 | 90 | 7 | 80 | 630 | MEDIUM |
| 10 | Sitar | India | 12JUN2009 | 75 | 8 | 76 | 600 | MEDIUM |
| 11 | Tabla | India | 25SEP2014 | 70 | 7 | 78 | 490 | MEDIUM |
| 12 | Trumpet | France | 30JUL2011 | 88 | 6 | 74 | 528 | MEDIUM |
| 13 | Ukulele | Hawaii | 10DEC2016 | 60 | 4 | 82 | 240 | MEDIUM |
| 14 | Violin | Italy | 15FEB2008 | 78 | 8 | 88 | 624 | HIGH |
11.PROC TRANSPOSE – STRUCTURAL TRANSFORMATION
proc transpose data=instruments_rated out=transposed prefix=Value_;
by Instrument_Name;
var Sound_Level Complexity Popularity;
run;
proc print data=transposed;
run;
OUTPUT:
| Obs | Instrument_Name | _NAME_ | Value_1 |
|---|---|---|---|
| 1 | Drums | Sound_Level | 95 |
| 2 | Drums | Complexity | 7 |
| 3 | Drums | Popularity | 85 |
| 4 | Guitar | Sound_Level | 85 |
| 5 | Guitar | Complexity | 6 |
| 6 | Guitar | Popularity | 90 |
| 7 | Harmonium | Sound_Level | 73 |
| 8 | Harmonium | Complexity | 6 |
| 9 | Harmonium | Popularity | 71 |
| 10 | Piano | Sound_Level | 80 |
| 11 | Piano | Complexity | 9 |
| 12 | Piano | Popularity | 92 |
| 13 | Saxophone | Sound_Level | 90 |
| 14 | Saxophone | Complexity | 7 |
| 15 | Saxophone | Popularity | 80 |
| 16 | Sitar | Sound_Level | 75 |
| 17 | Sitar | Complexity | 8 |
| 18 | Sitar | Popularity | 76 |
| 19 | Tabla | Sound_Level | 70 |
| 20 | Tabla | Complexity | 7 |
| 21 | Tabla | Popularity | 78 |
| 22 | Trumpet | Sound_Level | 88 |
| 23 | Trumpet | Complexity | 6 |
| 24 | Trumpet | Popularity | 74 |
| 25 | Ukulele | Sound_Level | 60 |
| 26 | Ukulele | Complexity | 4 |
| 27 | Ukulele | Popularity | 82 |
| 28 | Violin | Sound_Level | 78 |
| 29 | Violin | Complexity | 8 |
| 30 | Violin | Popularity | 88 |
No comments:
Post a Comment