204.ANALYZING SOFTWARE |APPLICATIONS ACROSS CATEGORIES USING PROC PRINT | PROC SORT | PROC MEANS | PROC FREQ | PROC SQL | PROC MACRO | SAS POWER | FOR SMART BUSINESS DECISIONS

ANALYZING SOFTWARE |APPLICATIONS ACROSS CATEGORIES USING  PROC PRINT | PROC SORT | PROC MEANS | PROC FREQ | PROC SQL | PROC MACRO | SAS POWER | FOR SMART BUSINESS  DECISIONS

/*Creating a unique SAS dataset about different types of software applications*/

Step 1: Creating the Software Dataset in SAS

data software_inventory;

    infile datalines dlm='' dsd truncover;

    input Software_ID $ Software_Name:$15. Category:$15. Version $ Vendor:$25. 

          License_Type:$15. Platform $ Cost Rating Popularity;

    datalines;

S01 Excel Spreadsheet 365 Microsoft Commercial Windows 500 4.5 95

S02 LibreOffice Spreadsheet 7.6 TheDocumentFoundation OpenSource Multi 0 4.2 70

S03 Photoshop Graphic 2024 Adobe Commercial Windows 1200 4.8 88

S04 GIMP Graphic 2.10 GNU OpenSource Multi 0 4.1 60

S05 SAS Analytics 9.4 SAS Commercial Windows 5000 4.9 85

S06 RStudio Analytics 2023 Posit OpenSource Multi 0 4.6 75

S07 MySQL Database 8.0 Oracle OpenSource Multi 0 4.5 78

S08 OracleDB Database 19c Oracle Commercial Multi 2500 4.7 80

S09 Zoom Communication 5.17 Zoom Commercial Multi 250 4.3 90

S10 Teams Communication 2023 Microsoft Commercial Multi 300 4.4 85

S11 Notepad++ Editor 8.5 DonHo OpenSource Windows 0 4.0 65

S12 VSCode Editor 1.89 Microsoft OpenSource Multi 0 4.8 92

S13 Tableau BI 2024 Salesforce Commercial Multi 3000 4.9 87

S14 PowerBI BI 2024 Microsoft Commercial Windows 1200 4.7 83

S15 VLC Media 3.0 VideoLAN OpenSource Multi 0 4.5 91

S16 AdobeReader Document 2024 Adobe Commercial Multi 100 4.2 76

S17 FoxitReader Document 2023 Foxit Commercial Multi 80 4.1 70

S18 Python Programming 3.12 PythonOrg OpenSource Multi 0 4.9 95

S19 MATLAB Programming R2023a MathWorks Commercial Multi 5000 4.6 72

S20 Git VersionControl 2.42 GitProject OpenSource Multi 0 4.7 90

;

run;

proc print;run;

Output:

Obs Software_ID Software_Name Category Version Vendor License_Type Platform Cost Rating Popularity
1 S01 Excel Spreadsheet 365 Microsoft Commercial Windows 500 4.5 95
2 S02 LibreOffice Spreadsheet 7.6 TheDocumentFoundation OpenSource Multi 0 4.2 70
3 S03 Photoshop Graphic 2024 Adobe Commercial Windows 1200 4.8 88
4 S04 GIMP Graphic 2.10 GNU OpenSource Multi 0 4.1 60
5 S05 SAS Analytics 9.4 SAS Commercial Windows 5000 4.9 85
6 S06 RStudio Analytics 2023 Posit OpenSource Multi 0 4.6 75
7 S07 MySQL Database 8.0 Oracle OpenSource Multi 0 4.5 78
8 S08 OracleDB Database 19c Oracle Commercial Multi 2500 4.7 80
9 S09 Zoom Communication 5.17 Zoom Commercial Multi 250 4.3 90
10 S10 Teams Communication 2023 Microsoft Commercial Multi 300 4.4 85
11 S11 Notepad++ Editor 8.5 DonHo OpenSource Windows 0 4.0 65
12 S12 VSCode Editor 1.89 Microsoft OpenSource Multi 0 4.8 92
13 S13 Tableau BI 2024 Salesforce Commercial Multi 3000 4.9 87
14 S14 PowerBI BI 2024 Microsoft Commercial Windows 1200 4.7 83
15 S15 VLC Media 3.0 VideoLAN OpenSource Multi 0 4.5 91
16 S16 AdobeReader Document 2024 Adobe Commercial Multi 100 4.2 76
17 S17 FoxitReader Document 2023 Foxit Commercial Multi 80 4.1 70
18 S18 Python Programming 3.12 PythonOrg OpenSource Multi 0 4.9 95
19 S19 MATLAB Programming R2023a MathWorks Commercial Multi 5000 4.6 72
20 S20 Git VersionControl 2.42 GitProject OpenSource Multi 0 4.7 90


Step 2: PROC PRINT – View the Dataset

proc print data=software_inventory label;

    title "Software Inventory Dataset - First Look";

run;

Output:

Software Inventory Dataset - First Look

Obs Software_ID Software_Name Category Version Vendor License_Type Platform Cost Rating Popularity
1 S01 Excel Spreadsheet 365 Microsoft Commercial Windows 500 4.5 95
2 S02 LibreOffice Spreadsheet 7.6 TheDocumentFoundation OpenSource Multi 0 4.2 70
3 S03 Photoshop Graphic 2024 Adobe Commercial Windows 1200 4.8 88
4 S04 GIMP Graphic 2.10 GNU OpenSource Multi 0 4.1 60
5 S05 SAS Analytics 9.4 SAS Commercial Windows 5000 4.9 85
6 S06 RStudio Analytics 2023 Posit OpenSource Multi 0 4.6 75
7 S07 MySQL Database 8.0 Oracle OpenSource Multi 0 4.5 78
8 S08 OracleDB Database 19c Oracle Commercial Multi 2500 4.7 80
9 S09 Zoom Communication 5.17 Zoom Commercial Multi 250 4.3 90
10 S10 Teams Communication 2023 Microsoft Commercial Multi 300 4.4 85
11 S11 Notepad++ Editor 8.5 DonHo OpenSource Windows 0 4.0 65
12 S12 VSCode Editor 1.89 Microsoft OpenSource Multi 0 4.8 92
13 S13 Tableau BI 2024 Salesforce Commercial Multi 3000 4.9 87
14 S14 PowerBI BI 2024 Microsoft Commercial Windows 1200 4.7 83
15 S15 VLC Media 3.0 VideoLAN OpenSource Multi 0 4.5 91
16 S16 AdobeReader Document 2024 Adobe Commercial Multi 100 4.2 76
17 S17 FoxitReader Document 2023 Foxit Commercial Multi 80 4.1 70
18 S18 Python Programming 3.12 PythonOrg OpenSource Multi 0 4.9 95
19 S19 MATLAB Programming R2023a MathWorks Commercial Multi 5000 4.6 72
20 S20 Git VersionControl 2.42 GitProject OpenSource Multi 0 4.7 90

Step 3: PROC SORT – Sorting by Category and Cost

proc sort data=software_inventory out=sorted_software;

    by Category descending Cost;

run;


proc print data=sorted_software;

    title "Sorted Software Inventory by Category and Cost";

run;

Output:

Sorted Software Inventory by Category and Cost

Obs Software_ID Software_Name Category Version Vendor License_Type Platform Cost Rating Popularity
1 S05 SAS Analytics 9.4 SAS Commercial Windows 5000 4.9 85
2 S06 RStudio Analytics 2023 Posit OpenSource Multi 0 4.6 75
3 S13 Tableau BI 2024 Salesforce Commercial Multi 3000 4.9 87
4 S14 PowerBI BI 2024 Microsoft Commercial Windows 1200 4.7 83
5 S10 Teams Communication 2023 Microsoft Commercial Multi 300 4.4 85
6 S09 Zoom Communication 5.17 Zoom Commercial Multi 250 4.3 90
7 S08 OracleDB Database 19c Oracle Commercial Multi 2500 4.7 80
8 S07 MySQL Database 8.0 Oracle OpenSource Multi 0 4.5 78
9 S16 AdobeReader Document 2024 Adobe Commercial Multi 100 4.2 76
10 S17 FoxitReader Document 2023 Foxit Commercial Multi 80 4.1 70
11 S11 Notepad++ Editor 8.5 DonHo OpenSource Windows 0 4.0 65
12 S12 VSCode Editor 1.89 Microsoft OpenSource Multi 0 4.8 92
13 S03 Photoshop Graphic 2024 Adobe Commercial Windows 1200 4.8 88
14 S04 GIMP Graphic 2.10 GNU OpenSource Multi 0 4.1 60
15 S15 VLC Media 3.0 VideoLAN OpenSource Multi 0 4.5 91
16 S19 MATLAB Programming R2023a MathWorks Commercial Multi 5000 4.6 72
17 S18 Python Programming 3.12 PythonOrg OpenSource Multi 0 4.9 95
18 S01 Excel Spreadsheet 365 Microsoft Commercial Windows 500 4.5 95
19 S02 LibreOffice Spreadsheet 7.6 TheDocumentFoundation OpenSource Multi 0 4.2 70
20 S20 Git VersionControl 2.42 GitProject OpenSource Multi 0 4.7 90

Step 4: PROC MEANS – Descriptive Stats of Cost, Rating, Popularity

proc means data=software_inventory mean min max std;

    var Cost Rating Popularity;

    title "Descriptive Statistics for Cost, Rating, and Popularity";

run;

Output:

Descriptive Statistics for Cost, Rating, and Popularity

The MEANS Procedure

Variable Mean Minimum Maximum Std Dev
Cost
Rating
Popularity
956.5000000
4.5200000
81.3500000
0
4.0000000
60.0000000
5000.00
4.9000000
95.0000000
1626.28
0.2894641
10.2200783

Step 5: PROC FREQ – Frequency Distribution of License_Type and Platform

proc freq data=software_inventory;

    tables License_Type Platform;

    title "Frequency Analysis of License Types and Platforms";

run;

Output:

Frequency Analysis of License Types and Platforms

The FREQ Procedure

License_Type Frequency Percent Cumulative
Frequency
Cumulative
Percent
Commercial 11 55.00 11 55.00
OpenSource 9 45.00 20 100.00

Platform Frequency Percent Cumulative
Frequency
Cumulative
Percent
Multi 15 75.00 15 75.00
Windows 5 25.00 20 100.00

Step 6: PROC SQL – Complex Querying

a) Average Rating by Software Category

proc sql;

    select Category, avg(Rating) as Avg_Rating

    from software_inventory

    group by Category

    order by Avg_Rating desc;

quit;

Output:

Category Avg_Rating
BI 4.8
Programming 4.75
Analytics 4.75
VersionControl 4.7
Database 4.6
Media 4.5
Graphic 4.45
Editor 4.4
Communication 4.35
Spreadsheet 4.35
Document 4.15

b) Software Above Average Popularity & Free

proc sql;

    select Software_Name, Popularity, License_Type

    from software_inventory

    where Popularity > (select avg(Popularity) from software_inventory)

          and License_Type = 'OpenSource';

quit;

Output:

Software_Name Popularity License_Type
VSCode 92 OpenSource
VLC 91 OpenSource
Python 95 OpenSource
Git 90 OpenSource

Step 7: PROC FORMAT – Better Labeling for Clarity

proc format;

    value $licensefmt

        'OpenSource' = 'Free & Open Source'

        'Commercial' = 'Paid/Proprietary';


    value ratingfmt

        low-4 = 'Below Average'

        4<-4.5 = 'Average'

        4.5<-5 = 'Above Average';

run;


proc print data=software_inventory;

    format License_Type $licensefmt. Rating ratingfmt.;

    title "Formatted Output of Software Inventory";

run;

Output:

Formatted Output of Software Inventory

Obs Software_ID Software_Name Category Version Vendor License_Type Platform Cost Rating Popularity
1 S01 Excel Spreadsheet 365 Microsoft Paid/Proprietary Windows 500 Average 95
2 S02 LibreOffice Spreadsheet 7.6 TheDocumentFoundation Free & Open Source Multi 0 Average 70
3 S03 Photoshop Graphic 2024 Adobe Paid/Proprietary Windows 1200 Above Average 88
4 S04 GIMP Graphic 2.10 GNU Free & Open Source Multi 0 Average 60
5 S05 SAS Analytics 9.4 SAS Paid/Proprietary Windows 5000 Above Average 85
6 S06 RStudio Analytics 2023 Posit Free & Open Source Multi 0 Above Average 75
7 S07 MySQL Database 8.0 Oracle Free & Open Source Multi 0 Average 78
8 S08 OracleDB Database 19c Oracle Paid/Proprietary Multi 2500 Above Average 80
9 S09 Zoom Communication 5.17 Zoom Paid/Proprietary Multi 250 Average 90
10 S10 Teams Communication 2023 Microsoft Paid/Proprietary Multi 300 Average 85
11 S11 Notepad++ Editor 8.5 DonHo Free & Open Source Windows 0 Below Average 65
12 S12 VSCode Editor 1.89 Microsoft Free & Open Source Multi 0 Above Average 92
13 S13 Tableau BI 2024 Salesforce Paid/Proprietary Multi 3000 Above Average 87
14 S14 PowerBI BI 2024 Microsoft Paid/Proprietary Windows 1200 Above Average 83
15 S15 VLC Media 3.0 VideoLAN Free & Open Source Multi 0 Average 91
16 S16 AdobeReader Document 2024 Adobe Paid/Proprietary Multi 100 Average 76
17 S17 FoxitReader Document 2023 Foxit Paid/Proprietary Multi 80 Average 70
18 S18 Python Programming 3.12 PythonOrg Free & Open Source Multi 0 Above Average 95
19 S19 MATLAB Programming R2023a MathWorks Paid/Proprietary Multi 5000 Above Average 72
20 S20 Git VersionControl 2.42 GitProject Free & Open Source Multi 0 Above Average 90

Step 8: Creating MACROS to Automate Analysis

Macro 1: Filter Software by Category

%macro filter_by_category(cat);

    proc print data=software_inventory;

        where Category = "&cat";

        title "Filtered Software List for Category: &cat";

    run;

%mend;


%filter_by_category(Analytics)

Output:

Filtered Software List for Category: Analytics

Obs Software_ID Software_Name Category Version Vendor License_Type Platform Cost Rating Popularity
5 S05 SAS Analytics 9.4 SAS Commercial Windows 5000 4.9 85
6 S06 RStudio Analytics 2023 Posit OpenSource Multi 0 4.6 75

%filter_by_category(Programming)

Output:

Filtered Software List for Category: Programming

Obs Software_ID Software_Name Category Version Vendor License_Type Platform Cost Rating Popularity
18 S18 Python Programming 3.12 PythonOrg OpenSource Multi 0 4.9 95
19 S19 MATLAB Programming R2023a MathWorks Commercial Multi 5000 4.6 72

Macro 2: Summarize by Platform

%macro summary_by_platform(platform);

    proc sql;

        select count(*) as Software_Count,

               avg(Cost) as Avg_Cost,

               avg(Popularity) as Avg_Popularity

        from software_inventory

        where Platform = "&platform";

    quit;

%mend;


%summary_by_platform(Windows)

Output:

Software_Count Avg_Cost Avg_Popularity
5 1580 83.2

%summary_by_platform(Multi)

Output:

Software_Count Avg_Cost Avg_Popularity
15 748.6667 80.73333

Step 9: Additional Logical Variables – High Cost or Not

data software_inventory_updated;

    set software_inventory;

    High_Cost = (Cost > 1000);

run;

proc print;run;

Output:

Obs Software_ID Software_Name Category Version Vendor License_Type Platform Cost Rating Popularity High_Cost
1 S01 Excel Spreadsheet 365 Microsoft Commercial Windows 500 4.5 95 0
2 S02 LibreOffice Spreadsheet 7.6 TheDocumentFoundation OpenSource Multi 0 4.2 70 0
3 S03 Photoshop Graphic 2024 Adobe Commercial Windows 1200 4.8 88 1
4 S04 GIMP Graphic 2.10 GNU OpenSource Multi 0 4.1 60 0
5 S05 SAS Analytics 9.4 SAS Commercial Windows 5000 4.9 85 1
6 S06 RStudio Analytics 2023 Posit OpenSource Multi 0 4.6 75 0
7 S07 MySQL Database 8.0 Oracle OpenSource Multi 0 4.5 78 0
8 S08 OracleDB Database 19c Oracle Commercial Multi 2500 4.7 80 1
9 S09 Zoom Communication 5.17 Zoom Commercial Multi 250 4.3 90 0
10 S10 Teams Communication 2023 Microsoft Commercial Multi 300 4.4 85 0
11 S11 Notepad++ Editor 8.5 DonHo OpenSource Windows 0 4.0 65 0
12 S12 VSCode Editor 1.89 Microsoft OpenSource Multi 0 4.8 92 0
13 S13 Tableau BI 2024 Salesforce Commercial Multi 3000 4.9 87 1
14 S14 PowerBI BI 2024 Microsoft Commercial Windows 1200 4.7 83 1
15 S15 VLC Media 3.0 VideoLAN OpenSource Multi 0 4.5 91 0
16 S16 AdobeReader Document 2024 Adobe Commercial Multi 100 4.2 76 0
17 S17 FoxitReader Document 2023 Foxit Commercial Multi 80 4.1 70 0
18 S18 Python Programming 3.12 PythonOrg OpenSource Multi 0 4.9 95 0
19 S19 MATLAB Programming R2023a MathWorks Commercial Multi 5000 4.6 72 1
20 S20 Git VersionControl 2.42 GitProject OpenSource Multi 0 4.7 90 0

proc freq data=software_inventory_updated;

    tables High_Cost;

    title "High Cost Software Count";

run;

Output:

High Cost Software Count

The FREQ Procedure

High_Cost Frequency Percent Cumulative
Frequency
Cumulative
Percent
0 14 70.00 14 70.00
1 6 30.00 20 100.00


Step 10: PROC REPORT – Stylish Summary

proc report data=software_inventory nowd;

    column Category Software_Name Cost Rating Popularity;

    define Category / group;

    define Software_Name / display;

    define Cost / analysis mean;

    define Rating / analysis mean;

    define Popularity / analysis mean;

    title "Report by Software Category";

run;

Output:

Report by Software Category

Category Software_Name Cost Rating Popularity
Analytics SAS 5000 4.9 85
  RStudio 0 4.6 75
BI Tableau 3000 4.9 87
  PowerBI 1200 4.7 83
Communication Zoom 250 4.3 90
  Teams 300 4.4 85
Database MySQL 0 4.5 78
  OracleDB 2500 4.7 80
Document AdobeReader 100 4.2 76
  FoxitReader 80 4.1 70
Editor Notepad++ 0 4 65
  VSCode 0 4.8 92
Graphic Photoshop 1200 4.8 88
  GIMP 0 4.1 60
Media VLC 0 4.5 91
Programming Python 0 4.9 95
  MATLAB 5000 4.6 72
Spreadsheet Excel 500 4.5 95
  LibreOffice 0 4.2 70
VersionControl Git 0 4.7 90

Step 11: Visualizations 

proc sgplot data=software_inventory;

    vbar Category / response=Cost stat=mean datalabel;

    title "Average Cost by Software Category";

run;

Log:

NOTE: PROCEDURE SGPLOT used (Total process time):
      real time           3.04 seconds
      cpu time            0.53 seconds

NOTE: Listing image output written to SGPlot1.png.
NOTE: There were 20 observations read from the data set WORK.SOFTWARE_INVENTORY.



To Visit My Previous First. Last. Statements:Click Here
To Visit My Previous Set Where Statements:Click Here
To Visit My Previous Online Gaming Dataset:Click Here
To Visit My Previous Length Statement Using In Many Ways:Click Here





--->PLEASE FOLLOW OUR BLOG FOR MORE INFORMATION.
--->PLEASE DO COMMENTS AND SHARE OUR BLOG.

PLEASE FOLLOW OUR TELEGRAM CHANNEL CLICK HERE

PLEASE FOLLOW OUR FACEBOOK PAGE  CLICK HERE

PLEASE FOLLOW OUR INSTAGRAM PAGE CLICK HERE


Comments

Popular posts from this blog

409.Can We Build a Reliable Emergency Services Analytics & Fraud Detection System in SAS While Identifying and Fixing Intentional Errors?

397.If a satellite has excellent signal strength but very high latency, can it still deliver good quality communication? Why or why not?A Sas Study

401.How Efficient Are Global Data Centers? A Complete SAS Analytics Study