SOFTWARE COMPANY ANALYSIS USING PROC CONTENTS | PROC PRINT | PROC MEANS | PROC FREQ | PROC SORT | PROC REPORT | PROC SQL | PROC REG
/*Creating The Dataset Of Software Company */
1) Create the dataset
options nocenter;
data software_companies;
infile datalines dsd truncover;
length CompanyID 8 CompanyName $40 FoundedYear 8 Employees 8
Revenue_MnUSD 8 PrimaryProduct $30 HQ_City $30 Valuation_MnUSD 8;
input CompanyID CompanyName :$40. FoundedYear Employees Revenue_MnUSD
PrimaryProduct :$30. HQ_City :$30. Valuation_MnUSD;
RevenuePerEmployee = round((Revenue_MnUSD*1000000)/max(Employees,1),0.01);
format RevenuePerEmployee comma12.2 Revenue_MnUSD Valuation_MnUSD comma12.2;
datalines;
1,BlueOrbit Solutions,2008,120,18.5,Cloud ERP,Hyderabad,120.0
2,NeuraCode Labs,2015,45,6.2,AI Analytics,Bengaluru,32.0
3,StackWave Systems,2000,520,210.3,DevOps Platform,Pune,950.0
4,PixelForge Studio,2012,30,2.1,Mobile Games,Mumbai,11.5
5,QuantumBridge,2018,80,25.0,Blockchain Infra,Chennai,140.0
6,SafeGuard Tech,2005,220,55.4,Cybersecurity,New Delhi,410.0
7,DataSpring Inc,2010,150,38.0,Data Integration,Bengaluru,210.0
8,CloudVista,2014,300,90.7,Cloud Hosting,Hyderabad,480.0
9,EdgeLeap Systems,2020,25,1.0,Edge IoT,Coimbatore,6.5
10,OpenStream Soft,1998,900,512.3,Streaming Infra,Mumbai,3200.0
11,GreenByte Solutions,2003,60,7.4,Enterprise SaaS,Ahmedabad,45.0
12,MonoTech Labs,2016,12,0.4,Prototype Tools,Visakhapatnam,2.0
;
run;
proc print;run;
Output:
| Obs | CompanyID | CompanyName | FoundedYear | Employees | Revenue_MnUSD | PrimaryProduct | HQ_City | Valuation_MnUSD | RevenuePerEmployee |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | BlueOrbit Solutions | 2008 | 120 | 18.50 | Cloud ERP | Hyderabad | 120.00 | 154,166.67 |
| 2 | 2 | NeuraCode Labs | 2015 | 45 | 6.20 | AI Analytics | Bengaluru | 32.00 | 137,777.78 |
| 3 | 3 | StackWave Systems | 2000 | 520 | 210.30 | DevOps Platform | Pune | 950.00 | 404,423.08 |
| 4 | 4 | PixelForge Studio | 2012 | 30 | 2.10 | Mobile Games | Mumbai | 11.50 | 70,000.00 |
| 5 | 5 | QuantumBridge | 2018 | 80 | 25.00 | Blockchain Infra | Chennai | 140.00 | 312,500.00 |
| 6 | 6 | SafeGuard Tech | 2005 | 220 | 55.40 | Cybersecurity | New Delhi | 410.00 | 251,818.18 |
| 7 | 7 | DataSpring Inc | 2010 | 150 | 38.00 | Data Integration | Bengaluru | 210.00 | 253,333.33 |
| 8 | 8 | CloudVista | 2014 | 300 | 90.70 | Cloud Hosting | Hyderabad | 480.00 | 302,333.33 |
| 9 | 9 | EdgeLeap Systems | 2020 | 25 | 1.00 | Edge IoT | Coimbatore | 6.50 | 40,000.00 |
| 10 | 10 | OpenStream Soft | 1998 | 900 | 512.30 | Streaming Infra | Mumbai | 3,200.00 | 569,222.22 |
| 11 | 11 | GreenByte Solutions | 2003 | 60 | 7.40 | Enterprise SaaS | Ahmedabad | 45.00 | 123,333.33 |
| 12 | 12 | MonoTech Labs | 2016 | 12 | 0.40 | Prototype Tools | Visakhapatnam | 2.00 | 33,333.33 |
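The RevenuePerEmployee column converts revenue from millions of USD to USD before dividing by headcount, and max(Employees,1) guards against division by zero. As a quick sanity check, the sketch below recomputes the value for CompanyID 1 by hand; the literal inputs are copied from the first data line and the step is illustrative only.

```sas
/* Recompute RevenuePerEmployee for CompanyID 1 (18.5 Mn USD, 120 employees). */
data _null_;
    Revenue_MnUSD = 18.5;
    Employees     = 120;
    /* Same formula as in the DATA step that builds software_companies */
    RevenuePerEmployee = round((Revenue_MnUSD*1000000)/max(Employees,1), 0.01);
    put RevenuePerEmployee= comma12.2;
run;
```

The log should show 154,166.67, matching the first row of the printed dataset.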
2) Basic exploration with PROCs
PROC CONTENTS — Purpose: Show dataset structure and variable attributes in one view.
proc contents data=software_companies;
run;
Output:
The CONTENTS Procedure
| Data Set Name | WORK.SOFTWARE_COMPANIES | Observations | 12 |
|---|---|---|---|
| Member Type | DATA | Variables | 9 |
| Engine | V9 | Indexes | 0 |
| Created | 09/06/2025 18:43:25 | Observation Length | 152 |
| Last Modified | 09/06/2025 18:43:25 | Deleted Observations | 0 |
| Protection | | Compressed | NO |
| Data Set Type | | Sorted | NO |
| Label | |||
| Data Representation | SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64 | ||
| Encoding | utf-8 Unicode (UTF-8) |
| Engine/Host Dependent Information | |
|---|---|
| Data Set Page Size | 131072 |
| Number of Data Set Pages | 1 |
| First Data Page | 1 |
| Max Obs per Page | 861 |
| Obs in First Data Page | 12 |
| Number of Data Set Repairs | 0 |
| Filename | /saswork/SAS_work47260000B4E9_odaws01-apse1-2.oda.sas.com/SAS_work70210000B4E9_odaws01-apse1-2.oda.sas.com/software_companies.sas7bdat |
| Release Created | 9.0401M8 |
| Host Created | Linux |
| Inode Number | 134456831 |
| Access Permission | rw-r--r-- |
| Owner Name | u63247146 |
| File Size | 256KB |
| File Size (bytes) | 262144 |
| Alphabetic List of Variables and Attributes | ||||
|---|---|---|---|---|
| # | Variable | Type | Len | Format |
| 1 | CompanyID | Num | 8 | |
| 2 | CompanyName | Char | 40 | |
| 4 | Employees | Num | 8 | |
| 3 | FoundedYear | Num | 8 | |
| 7 | HQ_City | Char | 30 | |
| 6 | PrimaryProduct | Char | 30 | |
| 9 | RevenuePerEmployee | Num | 8 | COMMA12.2 |
| 5 | Revenue_MnUSD | Num | 8 | COMMA12.2 |
| 8 | Valuation_MnUSD | Num | 8 | COMMA12.2 |
PROC PRINT — Purpose: Print the raw observations for quick manual inspection.
proc print data=software_companies noobs;
run;
Output:
| CompanyID | CompanyName | FoundedYear | Employees | Revenue_MnUSD | PrimaryProduct | HQ_City | Valuation_MnUSD | RevenuePerEmployee |
|---|---|---|---|---|---|---|---|---|
| 1 | BlueOrbit Solutions | 2008 | 120 | 18.50 | Cloud ERP | Hyderabad | 120.00 | 154,166.67 |
| 2 | NeuraCode Labs | 2015 | 45 | 6.20 | AI Analytics | Bengaluru | 32.00 | 137,777.78 |
| 3 | StackWave Systems | 2000 | 520 | 210.30 | DevOps Platform | Pune | 950.00 | 404,423.08 |
| 4 | PixelForge Studio | 2012 | 30 | 2.10 | Mobile Games | Mumbai | 11.50 | 70,000.00 |
| 5 | QuantumBridge | 2018 | 80 | 25.00 | Blockchain Infra | Chennai | 140.00 | 312,500.00 |
| 6 | SafeGuard Tech | 2005 | 220 | 55.40 | Cybersecurity | New Delhi | 410.00 | 251,818.18 |
| 7 | DataSpring Inc | 2010 | 150 | 38.00 | Data Integration | Bengaluru | 210.00 | 253,333.33 |
| 8 | CloudVista | 2014 | 300 | 90.70 | Cloud Hosting | Hyderabad | 480.00 | 302,333.33 |
| 9 | EdgeLeap Systems | 2020 | 25 | 1.00 | Edge IoT | Coimbatore | 6.50 | 40,000.00 |
| 10 | OpenStream Soft | 1998 | 900 | 512.30 | Streaming Infra | Mumbai | 3,200.00 | 569,222.22 |
| 11 | GreenByte Solutions | 2003 | 60 | 7.40 | Enterprise SaaS | Ahmedabad | 45.00 | 123,333.33 |
| 12 | MonoTech Labs | 2016 | 12 | 0.40 | Prototype Tools | Visakhapatnam | 2.00 | 33,333.33 |
PROC MEANS — Purpose: Get numeric summary statistics (N, mean, std, min, max) for numeric variables.
proc means data=software_companies n mean std min max maxdec=2;
var FoundedYear Employees Revenue_MnUSD Valuation_MnUSD RevenuePerEmployee;
run;
Output:
The MEANS Procedure
| Variable | N | Mean | Std Dev | Minimum | Maximum |
|---|---|---|---|---|---|
| FoundedYear | 12 | 2009.92 | 7.18 | 1998.00 | 2020.00 |
| Employees | 12 | 205.17 | 263.59 | 12.00 | 900.00 |
| Revenue_MnUSD | 12 | 80.61 | 148.48 | 0.40 | 512.30 |
| Valuation_MnUSD | 12 | 467.25 | 904.66 | 2.00 | 3200.00 |
| RevenuePerEmployee | 12 | 221020.10 | 160566.47 | 33333.33 | 569222.22 |
PROC FREQ — Purpose: Frequency counts for categorical variables to see distribution.
proc freq data=software_companies;
tables PrimaryProduct HQ_City / nocum nopercent;
run;
Output:
The FREQ Procedure
| PrimaryProduct | Frequency |
|---|---|
| AI Analytics | 1 |
| Blockchain Infra | 1 |
| Cloud ERP | 1 |
| Cloud Hosting | 1 |
| Cybersecurity | 1 |
| Data Integration | 1 |
| DevOps Platform | 1 |
| Edge IoT | 1 |
| Enterprise SaaS | 1 |
| Mobile Games | 1 |
| Prototype Tools | 1 |
| Streaming Infra | 1 |
| HQ_City | Frequency |
|---|---|
| Ahmedabad | 1 |
| Bengaluru | 2 |
| Chennai | 1 |
| Coimbatore | 1 |
| Hyderabad | 2 |
| Mumbai | 2 |
| New Delhi | 1 |
| Pune | 1 |
| Visakhapatnam | 1 |
PROC SORT — Purpose: Sort dataset by metric (e.g., Revenue) for reporting.
proc sort data=software_companies out=swc_sorted_by_revenue;
by descending Revenue_MnUSD;
run;
proc print data=swc_sorted_by_revenue;run;
Output:
| Obs | CompanyID | CompanyName | FoundedYear | Employees | Revenue_MnUSD | PrimaryProduct | HQ_City | Valuation_MnUSD | RevenuePerEmployee |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 10 | OpenStream Soft | 1998 | 900 | 512.30 | Streaming Infra | Mumbai | 3,200.00 | 569,222.22 |
| 2 | 3 | StackWave Systems | 2000 | 520 | 210.30 | DevOps Platform | Pune | 950.00 | 404,423.08 |
| 3 | 8 | CloudVista | 2014 | 300 | 90.70 | Cloud Hosting | Hyderabad | 480.00 | 302,333.33 |
| 4 | 6 | SafeGuard Tech | 2005 | 220 | 55.40 | Cybersecurity | New Delhi | 410.00 | 251,818.18 |
| 5 | 7 | DataSpring Inc | 2010 | 150 | 38.00 | Data Integration | Bengaluru | 210.00 | 253,333.33 |
| 6 | 5 | QuantumBridge | 2018 | 80 | 25.00 | Blockchain Infra | Chennai | 140.00 | 312,500.00 |
| 7 | 1 | BlueOrbit Solutions | 2008 | 120 | 18.50 | Cloud ERP | Hyderabad | 120.00 | 154,166.67 |
| 8 | 11 | GreenByte Solutions | 2003 | 60 | 7.40 | Enterprise SaaS | Ahmedabad | 45.00 | 123,333.33 |
| 9 | 2 | NeuraCode Labs | 2015 | 45 | 6.20 | AI Analytics | Bengaluru | 32.00 | 137,777.78 |
| 10 | 4 | PixelForge Studio | 2012 | 30 | 2.10 | Mobile Games | Mumbai | 11.50 | 70,000.00 |
| 11 | 9 | EdgeLeap Systems | 2020 | 25 | 1.00 | Edge IoT | Coimbatore | 6.50 | 40,000.00 |
| 12 | 12 | MonoTech Labs | 2016 | 12 | 0.40 | Prototype Tools | Visakhapatnam | 2.00 | 33,333.33 |
PROC REPORT — Purpose: Create a compact summary table combining computed columns for reporting.
proc report data=swc_sorted_by_revenue nowd;
columns CompanyID CompanyName HQ_City Employees Revenue_MnUSD RevenuePerEmployee Valuation_MnUSD;
define CompanyName / display width=30;
define Revenue_MnUSD / analysis format=comma12.2;
define RevenuePerEmployee / analysis format=comma12.2;
run;
Output:
| CompanyID | CompanyName | HQ_City | Employees | Revenue_MnUSD | RevenuePerEmployee | Valuation_MnUSD |
|---|---|---|---|---|---|---|
| 10 | OpenStream Soft | Mumbai | 900 | 512.30 | 569,222.22 | 3,200.00 |
| 3 | StackWave Systems | Pune | 520 | 210.30 | 404,423.08 | 950.00 |
| 8 | CloudVista | Hyderabad | 300 | 90.70 | 302,333.33 | 480.00 |
| 6 | SafeGuard Tech | New Delhi | 220 | 55.40 | 251,818.18 | 410.00 |
| 7 | DataSpring Inc | Bengaluru | 150 | 38.00 | 253,333.33 | 210.00 |
| 5 | QuantumBridge | Chennai | 80 | 25.00 | 312,500.00 | 140.00 |
| 1 | BlueOrbit Solutions | Hyderabad | 120 | 18.50 | 154,166.67 | 120.00 |
| 11 | GreenByte Solutions | Ahmedabad | 60 | 7.40 | 123,333.33 | 45.00 |
| 2 | NeuraCode Labs | Bengaluru | 45 | 6.20 | 137,777.78 | 32.00 |
| 4 | PixelForge Studio | Mumbai | 30 | 2.10 | 70,000.00 | 11.50 |
| 9 | EdgeLeap Systems | Coimbatore | 25 | 1.00 | 40,000.00 | 6.50 |
| 12 | MonoTech Labs | Visakhapatnam | 12 | 0.40 | 33,333.33 | 2.00 |
3) PROC SQL examples
PROC SQL — Purpose: Create aggregated summary
(total revenue and avg employees by HQ_City).
proc sql;
create table city_summary as
select HQ_City,
count(*) as NumCompanies,
sum(Revenue_MnUSD) as TotalRevenue_MnUSD format=comma12.2,
mean(Employees) as AvgEmployees format=8.2
from software_companies
group by HQ_City
order by calculated TotalRevenue_MnUSD desc;
quit;
proc print data=city_summary;run;
Output:
| Obs | HQ_City | NumCompanies | TotalRevenue_MnUSD | AvgEmployees |
|---|---|---|---|---|
| 1 | Mumbai | 2 | 514.40 | 465.00 |
| 2 | Pune | 1 | 210.30 | 520.00 |
| 3 | Hyderabad | 2 | 109.20 | 210.00 |
| 4 | New Delhi | 1 | 55.40 | 220.00 |
| 5 | Bengaluru | 2 | 44.20 | 97.50 |
| 6 | Chennai | 1 | 25.00 | 80.00 |
| 7 | Ahmedabad | 1 | 7.40 | 60.00 |
| 8 | Coimbatore | 1 | 1.00 | 25.00 |
| 9 | Visakhapatnam | 1 | 0.40 | 12.00 |
PROC SQL — Purpose: Select top 5 companies by valuation.
proc sql outobs=5;
select CompanyID, CompanyName, Valuation_MnUSD
from software_companies
order by Valuation_MnUSD desc;
quit;
Output:
| CompanyID | CompanyName | Valuation_MnUSD |
|---|---|---|
| 10 | OpenStream Soft | 3,200.00 |
| 3 | StackWave Systems | 950.00 |
| 8 | CloudVista | 480.00 |
| 6 | SafeGuard Tech | 410.00 |
| 7 | DataSpring Inc | 210.00 |
PROC SQL — Purpose: Left join with a simulated investment table to show a merge via SQL; companies without an investment row are kept with missing investor fields.
data investments;
infile datalines dsd truncover;
length CompanyID 8 Investor $40 Investment_MnUSD 8 Round $10;
input CompanyID Investor :$40. Investment_MnUSD Round :$10.;
datalines;
1,AlphaVentures,5.0,SeriesA
2,NeuroFund,2.0,Seed
3,InfraCap,50.0,SeriesC
6,ShieldPartners,20.0,SeriesB
8,CloudHoldings,100.0,SeriesD
10,StreamGlobal,200.0,SeriesF
7,DataSeed,10.0,SeriesB
;
run;
proc print;run;
Output:
| Obs | CompanyID | Investor | Investment_MnUSD | Round |
|---|---|---|---|---|
| 1 | 1 | AlphaVentures | 5 | SeriesA |
| 2 | 2 | NeuroFund | 2 | Seed |
| 3 | 3 | InfraCap | 50 | SeriesC |
| 4 | 6 | ShieldPartners | 20 | SeriesB |
| 5 | 8 | CloudHoldings | 100 | SeriesD |
| 6 | 10 | StreamGlobal | 200 | SeriesF |
| 7 | 7 | DataSeed | 10 | SeriesB |
proc sql;
create table company_investments as
select a.*, b.Investor, b.Investment_MnUSD, b.Round
from software_companies as a
left join investments as b
on a.CompanyID = b.CompanyID;
quit;
proc print data=company_investments;run;
Output:
| Obs | CompanyID | CompanyName | FoundedYear | Employees | Revenue_MnUSD | PrimaryProduct | HQ_City | Valuation_MnUSD | RevenuePerEmployee | Investor | Investment_MnUSD | Round |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | BlueOrbit Solutions | 2008 | 120 | 18.50 | Cloud ERP | Hyderabad | 120.00 | 154,166.67 | AlphaVentures | 5 | SeriesA |
| 2 | 2 | NeuraCode Labs | 2015 | 45 | 6.20 | AI Analytics | Bengaluru | 32.00 | 137,777.78 | NeuroFund | 2 | Seed |
| 3 | 3 | StackWave Systems | 2000 | 520 | 210.30 | DevOps Platform | Pune | 950.00 | 404,423.08 | InfraCap | 50 | SeriesC |
| 4 | 4 | PixelForge Studio | 2012 | 30 | 2.10 | Mobile Games | Mumbai | 11.50 | 70,000.00 | | . | |
| 5 | 5 | QuantumBridge | 2018 | 80 | 25.00 | Blockchain Infra | Chennai | 140.00 | 312,500.00 | | . | |
| 6 | 6 | SafeGuard Tech | 2005 | 220 | 55.40 | Cybersecurity | New Delhi | 410.00 | 251,818.18 | ShieldPartners | 20 | SeriesB |
| 7 | 7 | DataSpring Inc | 2010 | 150 | 38.00 | Data Integration | Bengaluru | 210.00 | 253,333.33 | DataSeed | 10 | SeriesB |
| 8 | 8 | CloudVista | 2014 | 300 | 90.70 | Cloud Hosting | Hyderabad | 480.00 | 302,333.33 | CloudHoldings | 100 | SeriesD |
| 9 | 9 | EdgeLeap Systems | 2020 | 25 | 1.00 | Edge IoT | Coimbatore | 6.50 | 40,000.00 | | . | |
| 10 | 10 | OpenStream Soft | 1998 | 900 | 512.30 | Streaming Infra | Mumbai | 3,200.00 | 569,222.22 | StreamGlobal | 200 | SeriesF |
| 11 | 11 | GreenByte Solutions | 2003 | 60 | 7.40 | Enterprise SaaS | Ahmedabad | 45.00 | 123,333.33 | | . | |
| 12 | 12 | MonoTech Labs | 2016 | 12 | 0.40 | Prototype Tools | Visakhapatnam | 2.00 | 33,333.33 | | . | |
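The query above uses a left join, so all 12 companies appear and the five without an investment record carry missing values. For contrast, a minimal sketch of the inner-join variant is shown here; the output table name company_investments_inner is hypothetical, and only rows whose CompanyID exists in both tables are kept.

```sas
/* Sketch: inner join keeps only companies that have a matching investment row. */
proc sql;
    create table company_investments_inner as   /* hypothetical table name */
    select a.CompanyID, a.CompanyName,
           b.Investor, b.Investment_MnUSD, b.Round
    from software_companies as a
         inner join investments as b
         on a.CompanyID = b.CompanyID
    order by a.CompanyID;
quit;

proc print data=company_investments_inner noobs;
run;
```

With the sample data this should return the seven companies listed in the investments table (IDs 1, 2, 3, 6, 7, 8 and 10).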
4) SAS MACROS
Macro: %top_companies_by_metric
— Purpose: Reusable macro to print top N companies by any numeric metric.
%macro top_companies_by_metric(metric=Revenue_MnUSD, n=5);
%put NOTE: Running top_companies_by_metric for &metric, top &n.;
proc sql outobs=&n;
select CompanyID, CompanyName, &metric
from software_companies
order by &metric desc;
quit;
%mend top_companies_by_metric;
%top_companies_by_metric(metric=Valuation_MnUSD,n=5);
Output:
| CompanyID | CompanyName | Valuation_MnUSD |
|---|---|---|
| 10 | OpenStream Soft | 3,200.00 |
| 3 | StackWave Systems | 950.00 |
| 8 | CloudVista | 480.00 |
| 6 | SafeGuard Tech | 410.00 |
| 7 | DataSpring Inc | 210.00 |
%top_companies_by_metric(metric=Revenue_MnUSD,n=3);
Output:
| CompanyID | CompanyName | Revenue_MnUSD |
|---|---|---|
| 10 | OpenStream Soft | 512.30 |
| 3 | StackWave Systems | 210.30 |
| 8 | CloudVista | 90.70 |
Macro: %segment_filter
— Purpose: Create a dataset for companies founded before/after a given year.
%macro segment_filter(year=2010, outds=before_after);
%if &year = %then %let year=2010;
data &outds;
set software_companies;
if FoundedYear < &year then Segment = "Before_&year";
else Segment = "After_&year";
run;
proc freq data=&outds; tables Segment / nocum nopercent; run;
%mend segment_filter;
%segment_filter(year=2010,outds=seg_2010);
Output:
The FREQ Procedure
| Segment | Frequency |
|---|---|
| After_2010 | 7 |
| Before_2010 | 5 |
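Macro variable references such as &year resolve only inside double-quoted strings; inside single quotes the text is kept literally, which is why a label can end up reading "&year" instead of the year value. A minimal sketch of the difference, using hypothetical variable names:

```sas
%let year = 2010;

data _null_;
    seg_single = 'Before_&year';   /* single quotes: macro reference is NOT resolved */
    seg_double = "Before_&year";   /* double quotes: resolves to Before_2010 */
    put seg_single= seg_double=;
run;
```

The log should show seg_single=Before_&year and seg_double=Before_2010.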
Macro: %calc_growth_estimate
— Purpose: Estimate simple hypothetical valuation growth over X years at assumed CAGR.
%macro calc_growth_estimate(cagr=0.20, years=3);
data growth_estimates;
set software_companies;
FutureValuation = round(Valuation_MnUSD * (1 + &cagr)**&years,0.1);
CAGR = &cagr;
Years = &years;
run;
proc sort data=growth_estimates; by descending FutureValuation; run;
proc print data=growth_estimates (obs=12) noobs;
var CompanyName Valuation_MnUSD FutureValuation Years CAGR;
run;
%mend calc_growth_estimate;
%calc_growth_estimate(cagr=0.25, years=5);
Output:
| CompanyName | Valuation_MnUSD | FutureValuation | Years | CAGR |
|---|---|---|---|---|
| OpenStream Soft | 3,200.00 | 9765.6 | 5 | 0.25 |
| StackWave Systems | 950.00 | 2899.2 | 5 | 0.25 |
| CloudVista | 480.00 | 1464.8 | 5 | 0.25 |
| SafeGuard Tech | 410.00 | 1251.2 | 5 | 0.25 |
| DataSpring Inc | 210.00 | 640.9 | 5 | 0.25 |
| QuantumBridge | 140.00 | 427.2 | 5 | 0.25 |
| BlueOrbit Solutions | 120.00 | 366.2 | 5 | 0.25 |
| GreenByte Solutions | 45.00 | 137.3 | 5 | 0.25 |
| NeuraCode Labs | 32.00 | 97.7 | 5 | 0.25 |
| PixelForge Studio | 11.50 | 35.1 | 5 | 0.25 |
| EdgeLeap Systems | 6.50 | 19.8 | 5 | 0.25 |
| MonoTech Labs | 2.00 | 6.1 | 5 | 0.25 |
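The macro applies the compound growth formula FutureValuation = Valuation × (1 + CAGR)^Years. As a hedged check of one row, the sketch below recomputes the multiplier for a CAGR of 0.25 over 5 years and applies it to the 3,200 Mn USD valuation of OpenStream Soft; the variable name multiplier is illustrative.

```sas
/* Sketch: recompute the growth multiplier and one projected valuation by hand. */
data _null_;
    cagr  = 0.25;
    years = 5;
    multiplier      = (1 + cagr)**years;          /* 1.25 to the power 5 */
    FutureValuation = round(3200 * multiplier, 0.1);
    put multiplier= 8.4 FutureValuation=;
run;
```

The multiplier is about 3.0518, which reproduces the 9765.6 shown in the first row of the table.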
5) Validation & QC checks
Purpose: Ensure data sanity
proc sort data=software_companies out=swc_dup nodupkey dupout=dups;
by CompanyID;
run;
proc print data=dups;
title 'Duplicate CompanyIDs if any';
run;
Log:
proc freq data=software_companies;
tables Employees / missing;
run;
Output:
The FREQ Procedure
| Employees | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| 12 | 1 | 8.33 | 1 | 8.33 |
| 25 | 1 | 8.33 | 2 | 16.67 |
| 30 | 1 | 8.33 | 3 | 25.00 |
| 45 | 1 | 8.33 | 4 | 33.33 |
| 60 | 1 | 8.33 | 5 | 41.67 |
| 80 | 1 | 8.33 | 6 | 50.00 |
| 120 | 1 | 8.33 | 7 | 58.33 |
| 150 | 1 | 8.33 | 8 | 66.67 |
| 220 | 1 | 8.33 | 9 | 75.00 |
| 300 | 1 | 8.33 | 10 | 83.33 |
| 520 | 1 | 8.33 | 11 | 91.67 |
| 900 | 1 | 8.33 | 12 | 100.00 |
proc sql;
select count(*) as CountNegEmployees from software_companies
where Employees < 0 or Revenue_MnUSD < 0 or Valuation_MnUSD < 0;
quit;
Output:
| CountNegEmployees |
|---|
| 0 |
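The frequency table with the missing option covers Employees only. A hedged sketch that counts non-missing and missing values for several numeric columns in one pass, assuming the same software_companies dataset:

```sas
title;   /* clear the title set by the earlier PROC PRINT */

/* N = non-missing count, NMISS = missing count per variable */
proc means data=software_companies n nmiss;
    var Employees Revenue_MnUSD Valuation_MnUSD;
run;
```

For the sample data every variable should report N = 12 and NMISS = 0, since all 12 data lines are fully populated.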
6) Example advanced analytics
Purpose: Linear regression of valuation on two predictors (revenue and employees)
proc reg data=software_companies;
model Valuation_MnUSD = Revenue_MnUSD Employees;
title 'Simple regression: Valuation predicted by Revenue and Employees';
run; quit;
Output:
The REG Procedure
Model: MODEL1
Dependent Variable: Valuation_MnUSD
| Number of Observations Read | 12 |
|---|---|
| Number of Observations Used | 12 |
| Analysis of Variance | |||||
|---|---|---|---|---|---|
| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
| Model | 2 | 8923884 | 4461942 | 511.09 | <.0001 |
| Error | 9 | 78573 | 8730.32769 | ||
| Corrected Total | 11 | 9002457 | |||
| Root MSE | 93.43622 | R-Square | 0.9913 |
|---|---|---|---|
| Dependent Mean | 467.25000 | Adj R-Sq | 0.9893 |
| Coeff Var | 19.99705 |
| Parameter Estimates | |||||
|---|---|---|---|---|---|
| Variable | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| |
| Intercept | 1 | 56.10443 | 46.93671 | 1.20 | 0.2625 |
| Revenue_MnUSD | 1 | 8.11022 | 0.96438 | 8.41 | <.0001 |
| Employees | 1 | -1.18248 | 0.54324 | -2.18 | 0.0575 |
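To see what the fitted equation implies for each company, the estimates can be applied by hand. The sketch below copies the three coefficients from the Parameter Estimates table; the dataset name predicted_valuation and the variables PredictedValuation and Residual are hypothetical. With only 12 observations, the estimates are best treated as illustrative.

```sas
/* Sketch: apply the fitted equation manually and compare with actual valuations. */
data predicted_valuation;   /* hypothetical dataset name */
    set software_companies;
    PredictedValuation = 56.10443
                       + 8.11022 * Revenue_MnUSD
                       - 1.18248 * Employees;
    Residual = Valuation_MnUSD - PredictedValuation;
    format PredictedValuation Residual comma12.2;
run;

proc print data=predicted_valuation noobs;
    var CompanyName Valuation_MnUSD PredictedValuation Residual;
run;
```

PROC REG can also write predicted values and residuals directly through its OUTPUT statement, which avoids retyping rounded coefficients.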