222.ORGANIZATIONAL DATA MANAGEMENT AND SEGMENTATION IN SAS USING PROC PRINT | PROC SQL | PROC MEANS | PROC FREQ | FOR DATA CLEANING VARIABLE ENRICHMENT CONDITIONAL UPDATES | DATA MERGING AND MISSING VALUE ANALYSIS

ORGANIZATIONAL DATA MANAGEMENT AND SEGMENTATION IN SAS USING PROC PRINT | PROC SQL | PROC MEANS | PROC FREQ | FOR DATA CLEANING VARIABLE ENRICHMENT CONDITIONAL UPDATES | DATA MERGING AND MISSING VALUE ANALYSIS

STEP 1 : CREATING THE DATASET

 DATA NEWDATA;

 INPUT ID NAME:&$:15. AGE CLASSTYPE$ SCHOOLTYPE:$15. OFFICETYPE:$15. DEPTTYPE:$18.;

 DATALINES;

1  Priya Sharma           34  Class_A     Public       Regional     Science

2  Rohan Verma            41  Class_B     Private      Central      Mathematics

3  Meera Pillai           29  Class_C     Public       HeadOffice   Literature

4  Arjun Mehta            38  Class_A     Private      Zonal        Physics

5  Sneha Reddy            31  Class_D     Public       Regional     Computer_Science

6  Kabir Iqbal            45  Class_B     Government   Central      History

7  Isha Rao               36  Class_C     Private      HeadOffice   Biology

8  Nikhil Thomas          40  Class_A     Public       Zonal        Economics

9  Ananya Das             28  Class_D     Government   Regional     Sociology

10 Vinay Kulkarni         33  Class_B     Private      Central      Commerce

;

RUN;

PROC PRINT;RUN;

OUTPUT:

Obs ID NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE
1 1 Priya Sharma 34 Class_A Public Regional Science
2 2 Rohan Verma 41 Class_B Private Central Mathematics
3 3 Meera Pillai 29 Class_C Public HeadOffice Literature
4 4 Arjun Mehta 38 Class_A Private Zonal Physics
5 5 Sneha Reddy 31 Class_D Public Regional Computer_Science
6 6 Kabir Iqbal 45 Class_B Government Central History
7 7 Isha Rao 36 Class_C Private HeadOffice Biology
8 8 Nikhil Thomas 40 Class_A Public Zonal Economics
9 9 Ananya Das 28 Class_D Government Regional Sociology
10 10 Vinay Kulkarni 33 Class_B Private Central Commerce


STEP 2 : USING ASSINGMENT STATEMENT CREATINE VARIABLES

DATA NEW01;

 RETAIN ID FIRST_NAME  LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE;

 SET NEWDATA;

 FIRST_NAME = SCAN(NAME,1);

 LAST_NAME  = SCAN(NAME,2);

 DROP NAME;

 AGE = AGE+1;

RUN;

PROC PRINT;

RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE
1 1 Priya Sharma 35 Class_A Public Regional Science
2 2 Rohan Verma 42 Class_B Private Central Mathematics
3 3 Meera Pillai 30 Class_C Public HeadOffice Literature
4 4 Arjun Mehta 39 Class_A Private Zonal Physics
5 5 Sneha Reddy 32 Class_D Public Regional Computer_Science
6 6 Kabir Iqbal 46 Class_B Government Central History
7 7 Isha Rao 37 Class_C Private HeadOffice Biology
8 8 Nikhil Thomas 41 Class_A Public Zonal Economics
9 9 Ananya Das 29 Class_D Government Regional Sociology
10 10 Vinay Kulkarni 34 Class_B Private Central Commerce


STEP 3 :CREATING 3 DIFFERENT DATASETS FROM ONE DATASET

DATA PRIVATE 

            PUBLIC 

   GOVT;

 SET NEW01;

 IF SCHOOLTYPE = "Private" THEN OUTPUT PRIVATE;

 ELSE IF SCHOOLTYPE = "Public" THEN OUTPUT PUBLIC;

 ELSE OUTPUT GOVT;

RUN;

PROC PRINT DATA=PRIVATE;

RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE
1 2 Rohan Verma 42 Class_B Private Central Mathematics
2 4 Arjun Mehta 39 Class_A Private Zonal Physics
3 7 Isha Rao 37 Class_C Private HeadOffice Biology
4 10 Vinay Kulkarni 34 Class_B Private Central Commerce

PROC PRINT DATA=PUBLIC;

RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE
1 1 Priya Sharma 35 Class_A Public Regional Science
2 3 Meera Pillai 30 Class_C Public HeadOffice Literature
3 5 Sneha Reddy 32 Class_D Public Regional Computer_Science
4 8 Nikhil Thomas 41 Class_A Public Zonal Economics

PROC PRINT DATA=GOVT;

RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE
1 6 Kabir Iqbal 46 Class_B Government Central History
2 9 Ananya Das 29 Class_D Government Regional Sociology


STEP 4 : ALTERING THE TABLE

PROC SQL;

 ALTER TABLE PRIVATE ADD FEES NUM FORMAT = COMMA12.;

QUIT;

PROC PRINT;RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE FEES
1 2 Rohan Verma 42 Class_B Private Central Mathematics .
2 4 Arjun Mehta 39 Class_A Private Zonal Physics .
3 7 Isha Rao 37 Class_C Private HeadOffice Biology .
4 10 Vinay Kulkarni 34 Class_B Private Central Commerce .

PROC SQL;

 ALTER TABLE PUBLIC ADD EMAIL CHAR (30);

RUN;

PROC PRINT;RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE EMAIL
1 1 Priya Sharma 35 Class_A Public Regional Science  
2 3 Meera Pillai 30 Class_C Public HeadOffice Literature  
3 5 Sneha Reddy 32 Class_D Public Regional Computer_Science  
4 8 Nikhil Thomas 41 Class_A Public Zonal Economics  

PROC SQL;

 ALTER TABLE GOVT ADD STATUS CHAR(15);

QUIT;

PROC PRINT;RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE STATUS
1 6 Kabir Iqbal 46 Class_B Government Central History  
2 9 Ananya Das 29 Class_D Government Regional Sociology  


STEP 5 : UPDATING THE ALTERED TABLE 

PROC SQL;

 UPDATE PRIVATE

 SET FEES = CASE

   WHEN ID = 2 THEN 300000

   WHEN ID = 4 THEN 500000

   WHEN ID = 7 THEN 320000

   WHEN ID = 10 THEN 400000

   ELSE FEES

   END;

QUIT;

PROC PRINT;RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE FEES
1 2 Rohan Verma 42 Class_B Private Central Mathematics 300,000
2 4 Arjun Mehta 39 Class_A Private Zonal Physics 500,000
3 7 Isha Rao 37 Class_C Private HeadOffice Biology 320,000
4 10 Vinay Kulkarni 34 Class_B Private Central Commerce 400,000


PROC SQL;

 UPDATE PUBLIC

  SET EMAIL = CASE

    WHEN ID = 1 THEN "Priya.Sharma@gmail.com"

    WHEN ID = 3 THEN "Meera.Pillai@gmail.com"

WHEN ID = 5 THEN "Sneha.Reddy@gmail.com"

WHEN ID = 8 THEN "Nikhil.THomas@gmail.com"

ELSE EMAIL

END;

QUIT;

PROC PRINT;RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE EMAIL
1 1 Priya Sharma 35 Class_A Public Regional Science Priya.Sharma@gmail.com
2 3 Meera Pillai 30 Class_C Public HeadOffice Literature Meera.Pillai@gmail.com
3 5 Sneha Reddy 32 Class_D Public Regional Computer_Science Sneha.Reddy@gmail.com
4 8 Nikhil Thomas 41 Class_A Public Zonal Economics Nikhil.THomas@gmail.com


PROC SQL;

 UPDATE GOVT 

  SET STATUS = CASE

   WHEN ID = 6 THEN "ACTIVE"

   WHEN ID = 9 THEN  "" 

   ELSE STATUS

   END;

QUIT;

PROC PRINT;RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE STATUS
1 6 Kabir Iqbal 46 Class_B Government Central History ACTIVE
2 9 Ananya Das 29 Class_D Government Regional Sociology  


STEP 6: MERGING THE UPDATES TABLES

DATA JOIN;

 MERGE PRIVATE

                PUBLIC

        GOVT;

BY ID;

RUN;

PROC PRINT;RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE FEES EMAIL STATUS
1 1 Priya Sharma 35 Class_A Public Regional Science . Priya.Sharma@gmail.com  
2 2 Rohan Verma 42 Class_B Private Central Mathematics 300,000    
3 3 Meera Pillai 30 Class_C Public HeadOffice Literature . Meera.Pillai@gmail.com  
4 4 Arjun Mehta 39 Class_A Private Zonal Physics 500,000    
5 5 Sneha Reddy 32 Class_D Public Regional Computer_Science . Sneha.Reddy@gmail.com  
6 6 Kabir Iqbal 46 Class_B Government Central History .   ACTIVE
7 7 Isha Rao 37 Class_C Private HeadOffice Biology 320,000    
8 8 Nikhil Thomas 41 Class_A Public Zonal Economics . Nikhil.THomas@gmail.com  
9 9 Ananya Das 29 Class_D Government Regional Sociology .    
10 10 Vinay Kulkarni 34 Class_B Private Central Commerce 400,000    


STEP 7: MISSING DATA

PROC MEANS DATA = JOIN NMISS;
RUN;

OUTPUT:

The MEANS Procedure

Variable N Miss
ID
AGE
FEES
0
0
6


PROC FREQ DATA = JOIN ;
 TABLES EMAIL / MISSING;
RUN;

OUTPUT:

The FREQ Procedure

EMAIL Frequency Percent Cumulative
Frequency
Cumulative
Percent
  6 60.00 6 60.00
Meera.Pillai@gmail.com 1 10.00 7 70.00
Nikhil.THomas@gmail.com 1 10.00 8 80.00
Priya.Sharma@gmail.com 1 10.00 9 90.00
Sneha.Reddy@gmail.com 1 10.00 10 100.00


PROC FREQ DATA = JOIN ;
 TABLES STATUS / MISSING;
RUN;

OUTPUT:

The FREQ Procedure

STATUS Frequency Percent Cumulative
Frequency
Cumulative
Percent
  9 90.00 9 90.00
ACTIVE 1 10.00 10 100.00



To Visit My Previous Proc  Means And Nway Option:Click Here
To Visit My Previous Proc Means And CharType Option:Click Here
To Visit My Previous SAS Functions:Click Here
To Visit My Previous Length Statement Using In Many Ways:Click Here





--->PLEASE FOLLOW OUR BLOG FOR MORE INFORMATION.
--->PLEASE DO COMMENTS AND SHARE OUR BLOG.

PLEASE FOLLOW OUR TELEGRAM CHANNEL CLICK HERE

PLEASE FOLLOW OUR FACEBOOK PAGE  CLICK HERE

PLEASE FOLLOW OUR INSTAGRAM PAGE CLICK HERE




Comments