43. DATA VALIDATION TECHNIQUES-1



3. PROC SUMMARY:

-->PROC SUMMARY COMPUTES DESCRIPTIVE STATISTICS.

-->THE PRINT OPTION IS REQUIRED TO DISPLAY THE RESULTS; WITHOUT IT, PROC SUMMARY PRODUCES NO PRINTED OUTPUT.

-->WITH A VAR STATEMENT, IT GIVES THE SAME RESULTS AS PROC MEANS. WITHOUT A VAR STATEMENT, IT GIVES ONLY THE TOTAL COUNT OF OBSERVATIONS.


PROC SUMMARY DATA=SASUSER.CLASS2 PRINT;

RUN;


LOG:

NOTE: Writing HTML Body file: sashtml.htm

NOTE: There were 19 observations read from the data set SASUSER.CLASS2.

NOTE: PROCEDURE SUMMARY used (Total process time):

      real time           2.56 seconds

      cpu time            0.23 seconds


RESULT:

The SUMMARY Procedure

N Obs
19


PROC SUMMARY DATA=SASUSER.CLASS2 PRINT;

VAR AGE HEIGHT;

RUN;


LOG:


NOTE: There were 19 observations read from the data set SASUSER.CLASS2.

NOTE: PROCEDURE SUMMARY used (Total process time):

      real time           0.09 seconds

      cpu time            0.00 seconds


RESULT:

Variable  Label   N   Mean        Std Dev    Minimum     Maximum
Age       Age     19  13.3157895  1.4926722  11.0000000  16.0000000
Height    Height  19  62.3368421  5.1270752  51.3000000  72.0000000
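
-->THE COMPARISON WITH PROC MEANS CAN BE CHECKED WITH A SHORT SKETCH LIKE THE ONE BELOW. PROC MEANS PRINTS BY DEFAULT, SO IT NEEDS NO PRINT OPTION. THE SECOND STEP SHOWS THE OTHER COMMON USE OF PROC SUMMARY: WRITING THE STATISTICS TO A DATA SET INSTEAD OF PRINTING THEM. THE DATA SET NAME WORK.CLASS2_STATS IS AN ASSUMPTION MADE FOR THIS EXAMPLE AND IS NOT PART OF THE ORIGINAL POST.

```sas
/* PROC MEANS PRINTS N, MEAN, STD DEV, MINIMUM AND MAXIMUM BY DEFAULT. */
PROC MEANS DATA=SASUSER.CLASS2;
    VAR AGE HEIGHT;
RUN;

/* PROC SUMMARY WITHOUT PRINT: SEND THE STATISTICS TO A DATA SET.     */
/* WORK.CLASS2_STATS IS A HYPOTHETICAL NAME CHOSEN FOR THIS SKETCH.   */
/* AUTONAME BUILDS OUTPUT NAMES SUCH AS AGE_MEAN AND HEIGHT_STDDEV.   */
PROC SUMMARY DATA=SASUSER.CLASS2;
    VAR AGE HEIGHT;
    OUTPUT OUT=WORK.CLASS2_STATS MEAN= STD= MIN= MAX= / AUTONAME;
RUN;

/* LOOK AT THE STORED STATISTICS. */
PROC PRINT DATA=WORK.CLASS2_STATS;
RUN;
```

-->STORING THE STATISTICS IN A DATA SET IS USEFUL FOR VALIDATION BECAUSE THE VALUES CAN THEN BE COMPARED AGAINST EXPECTED RANGES IN A LATER STEP.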


4. PROC UNIVARIATE:


PROC UNIVARIATE DATA=SASUSER.CLASS2;

RUN;


LOG:

NOTE: PROCEDURE UNIVARIATE used (Total process time):

      real time           0.56 seconds

      cpu time            0.06 seconds


RESULT: WITHOUT A VAR STATEMENT, IT REPORTS THE FOLLOWING TABLES FOR EVERY NUMERIC VARIABLE:

-->MOMENTS

-->BASIC STATISTICAL MEASURES

-->TESTS FOR LOCATION: MU0=0

-->QUANTILES

-->EXTREME OBSERVATIONS


The UNIVARIATE Procedure
Variable: Age (Age)

Moments
N 19 Sum Weights 19
Mean 13.3157895 Sum Observations 253
Std Deviation 1.49267216 Variance 2.22807018
Skewness 0.06361167 Kurtosis -1.1109255
Uncorrected SS 3409 Corrected SS 40.1052632
Coeff Variation 11.2097909 Std Error Mean 0.34244248


Basic Statistical Measures
Location Variability
Mean 13.31579 Std Deviation 1.49267
Median 13.00000 Variance 2.22807
Mode 12.00000 Range 5.00000
    Interquartile Range 3.00000


Tests for Location: Mu0=0
Test Statistic p Value
Student's t t 38.88475 Pr > |t| <.0001
Sign M 9.5 Pr >= |M| <.0001
Signed Rank S 95 Pr >= |S| <.0001


Quantiles (Definition 5)
Level Quantile
100% Max 16
99% 16
95% 16
90% 15
75% Q3 15
50% Median 13
25% Q1 12
10% 11
5% 11
1% 11
0% Min 11


Extreme Observations
Lowest Highest
Value Obs Value Obs
11 18 15 8
11 11 15 14
12 16 15 17
12 13 15 19
12 10 16 15


The UNIVARIATE Procedure
Variable: Height (Height)

Moments
N 19 Sum Weights 19
Mean 62.3368421 Sum Observations 1184.4
Std Deviation 5.12707525 Variance 26.2869006
Skewness -0.2596696 Kurtosis -0.1389692
Uncorrected SS 74304.92 Corrected SS 473.164211
Coeff Variation 8.22479143 Std Error Mean 1.17623173


Basic Statistical Measures
Location Variability
Mean 62.33684 Std Deviation 5.12708
Median 62.80000 Variance 26.28690
Mode 62.50000 Range 20.70000
    Interquartile Range 9.00000

Note: The mode displayed is the smallest of 2 modes with a count of 2.


Tests for Location: Mu0=0
Test Statistic p Value
Student's t t 52.99708 Pr > |t| <.0001
Sign M 9.5 Pr >= |M| <.0001
Signed Rank S 95 Pr >= |S| <.0001


Quantiles (Definition 5)
Level Quantile
100% Max 72.0
99% 72.0
95% 72.0
90% 69.0
75% Q3 66.5
50% Median 62.8
25% Q1 57.5
10% 56.3
5% 51.3
1% 51.3
0% Min 51.3


Extreme Observations
Lowest Highest
Value Obs Value Obs
51.3 11 66.5 14
56.3 13 66.5 19
56.5 2 67.0 17
57.3 6 69.0 1
57.5 18 72.0 15


The UNIVARIATE Procedure
Variable: Weight (Weight)

Moments
N 19 Sum Weights 19
Mean 100.026316 Sum Observations 1900.5
Std Deviation 22.7739335 Variance 518.652047
Skewness 0.18335097 Kurtosis 0.68336484
Uncorrected SS 199435.75 Corrected SS 9335.73684
Coeff Variation 22.7679419 Std Error Mean 5.22469867


Basic Statistical Measures
Location Variability
Mean 100.0263 Std Deviation 22.77393
Median 99.5000 Variance 518.65205
Mode 84.0000 Range 99.50000
    Interquartile Range 28.50000

Note: The mode displayed is the smallest of 4 modes with a count of 2.


Tests for Location: Mu0=0
Test Statistic p Value
Student's t t 19.1449 Pr > |t| <.0001
Sign M 9.5 Pr >= |M| <.0001
Signed Rank S 95 Pr >= |S| <.0001


Quantiles (Definition 5)
Level Quantile
100% Max 150.0
99% 150.0
95% 150.0
90% 133.0
75% Q3 112.5
50% Median 99.5
25% Q1 84.0
10% 77.0
5% 50.5
1% 50.5
0% Min 50.5


Extreme Observations
Lowest Highest
Value Obs Value Obs
50.5 11 112.5 1
77.0 13 112.5 8
83.0 6 128.0 16
84.0 9 133.0 17
84.0 2 150.0 15


The UNIVARIATE Procedure
Variable: DOB

Moments
N 19 Sum Weights 19
Mean 18482.2632 Sum Observations 351163
Std Deviation 442.316734 Variance 195644.094
Skewness 0.13633824 Kurtosis -0.4606266
Uncorrected SS 6493808571 Corrected SS 3521593.68
Coeff Variation 2.39319574 Std Error Mean 101.474418


Basic Statistical Measures
Location Variability
Mean 18482.26 Std Deviation 442.31673
Median 18367.00 Variance 195644
Mode 18215.00 Range 1706
    Interquartile Range 580.00000


Tests for Location: Mu0=0
Test Statistic p Value
Student's t t 182.1372 Pr > |t| <.0001
Sign M 9.5 Pr >= |M| <.0001
Signed Rank S 95 Pr >= |S| <.0001


Quantiles (Definition 5)
Level Quantile
100% Max 19344
99% 19344
95% 19344
90% 19127
75% Q3 18793
50% Median 18367
25% Q1 18213
10% 17943
5% 17638
1% 17638
0% Min 17638


Extreme Observations
Lowest Highest
Value Obs Value Obs
17638 15 18793 10
17943 19 18945 13
17968 17 18975 16
18182 1 19127 18
18213 4 19344 11


The UNIVARIATE Procedure
Variable: CLASS

Moments
N 19 Sum Weights 19
Mean 8.31578947 Sum Observations 158
Std Deviation 1.49267216 Variance 2.22807018
Skewness 0.06361167 Kurtosis -1.1109255
Uncorrected SS 1354 Corrected SS 40.1052632
Coeff Variation 17.9498551 Std Error Mean 0.34244248


Basic Statistical Measures
Location Variability
Mean 8.315789 Std Deviation 1.49267
Median 8.000000 Variance 2.22807
Mode 7.000000 Range 5.00000
    Interquartile Range 3.00000


Tests for Location: Mu0=0
Test Statistic p Value
Student's t t 24.28376 Pr > |t| <.0001
Sign M 9.5 Pr >= |M| <.0001
Signed Rank S 95 Pr >= |S| <.0001


Quantiles (Definition 5)
Level Quantile
100% Max 11
99% 11
95% 11
90% 10
75% Q3 10
50% Median 8
25% Q1 7
10% 6
5% 6
1% 6
0% Min 6


Extreme Observations
Lowest Highest
Value Obs Value Obs
6 18 10 8
6 11 10 14
7 16 10 17
7 13 10 19
7 10 11 15
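

-->THE FULL OUTPUT ABOVE IS LONG. ONE WAY TO SHORTEN IT, SHOWN HERE AS A MINIMAL SKETCH, IS AN ODS SELECT STATEMENT BEFORE THE PROCEDURE. THE CHOICE OF TABLES AND VARIABLES BELOW IS AN ASSUMPTION MADE FOR ILLUSTRATION; THE ORIGINAL POST DOES NOT USE ODS SELECT.

```sas
/* KEEP ONLY THE MOMENTS AND EXTREME OBSERVATIONS TABLES.            */
/* THE SELECTION APPLIES TO THE NEXT PROCEDURE STEP ONLY.            */
ODS SELECT MOMENTS EXTREMEOBS;

PROC UNIVARIATE DATA=SASUSER.CLASS2;
    VAR AGE HEIGHT WEIGHT;
RUN;
```

-->FOR DATA VALIDATION, THE EXTREME OBSERVATIONS TABLE IS OFTEN THE MOST USEFUL ONE, BECAUSE IT LISTS THE LOWEST AND HIGHEST VALUES TOGETHER WITH THEIR OBSERVATION NUMBERS.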



PROC UNIVARIATE DATA=SASUSER.CLASS2;

VAR AGE;

RUN;


LOG:

NOTE: PROCEDURE UNIVARIATE used (Total process time):
      real time           0.07 seconds
      cpu time            0.01 seconds


RESULT: IF YOU NAME A SPECIFIC VARIABLE IN THE VAR STATEMENT, THE OUTPUT COVERS THAT VARIABLE ONLY.


The UNIVARIATE Procedure
Variable: Age (Age)

Moments
N 19 Sum Weights 19
Mean 13.3157895 Sum Observations 253
Std Deviation 1.49267216 Variance 2.22807018
Skewness 0.06361167 Kurtosis -1.1109255
Uncorrected SS 3409 Corrected SS 40.1052632
Coeff Variation 11.2097909 Std Error Mean 0.34244248

Basic Statistical Measures
Location Variability
Mean 13.31579 Std Deviation 1.49267
Median 13.00000 Variance 2.22807
Mode 12.00000 Range 5.00000
    Interquartile Range 3.00000

Tests for Location: Mu0=0
Test Statistic p Value
Student's t t 38.88475 Pr > |t| <.0001
Sign M 9.5 Pr >= |M| <.0001
Signed Rank S 95 Pr >= |S| <.0001

Quantiles (Definition 5)
Level Quantile
100% Max 16
99% 16
95% 16
90% 15
75% Q3 15
50% Median 13
25% Q1 12
10% 11
5% 11
1% 11
0% Min 11

Extreme Observations
Lowest Highest
Value Obs Value Obs
11 18 15 8
11 11 15 14
12 16 15 17
12 13 15 19
12 10 16 15
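

-->THE TESTS FOR LOCATION ABOVE USE THE DEFAULT MU0=0, SO THE T STATISTIC IS SIMPLY THE MEAN DIVIDED BY THE STD ERROR MEAN (FOR AGE, 13.3157895 / 0.34244248, WHICH IS ABOUT 38.88). TESTING AGE AGAINST ZERO IS RARELY MEANINGFUL, SO THE SKETCH BELOW SETS A DIFFERENT NULL VALUE WITH THE MU0= OPTION. THE VALUE 13 IS AN ASSUMPTION PICKED FOR ILLUSTRATION AND DOES NOT COME FROM THE ORIGINAL POST.

```sas
/* MU0= SETS THE NULL-HYPOTHESIS VALUE FOR THE TESTS FOR LOCATION.  */
/* THE VALUE 13 IS ILLUSTRATIVE ONLY.                               */
PROC UNIVARIATE DATA=SASUSER.CLASS2 MU0=13;
    VAR AGE;
RUN;
```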

