222.ORGANIZATIONAL DATA MANAGEMENT AND SEGMENTATION IN SAS USING PROC PRINT | PROC SQL | PROC MEANS | PROC FREQ | FOR DATA CLEANING VARIABLE ENRICHMENT CONDITIONAL UPDATES | DATA MERGING AND MISSING VALUE ANALYSIS

ORGANIZATIONAL DATA MANAGEMENT AND SEGMENTATION IN SAS USING PROC PRINT | PROC SQL | PROC MEANS | PROC FREQ | FOR DATA CLEANING VARIABLE ENRICHMENT CONDITIONAL UPDATES | DATA MERGING AND MISSING VALUE ANALYSIS

STEP 1 : CREATING THE DATASET

 DATA NEWDATA;

 INPUT ID NAME:&$:15. AGE CLASSTYPE$ SCHOOLTYPE:$15. OFFICETYPE:$15. DEPTTYPE:$18.;

 DATALINES;

1  Priya Sharma           34  Class_A     Public       Regional     Science

2  Rohan Verma            41  Class_B     Private      Central      Mathematics

3  Meera Pillai           29  Class_C     Public       HeadOffice   Literature

4  Arjun Mehta            38  Class_A     Private      Zonal        Physics

5  Sneha Reddy            31  Class_D     Public       Regional     Computer_Science

6  Kabir Iqbal            45  Class_B     Government   Central      History

7  Isha Rao               36  Class_C     Private      HeadOffice   Biology

8  Nikhil Thomas          40  Class_A     Public       Zonal        Economics

9  Ananya Das             28  Class_D     Government   Regional     Sociology

10 Vinay Kulkarni         33  Class_B     Private      Central      Commerce

;

RUN;

PROC PRINT;RUN;

OUTPUT:

Obs ID NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE
1 1 Priya Sharma 34 Class_A Public Regional Science
2 2 Rohan Verma 41 Class_B Private Central Mathematics
3 3 Meera Pillai 29 Class_C Public HeadOffice Literature
4 4 Arjun Mehta 38 Class_A Private Zonal Physics
5 5 Sneha Reddy 31 Class_D Public Regional Computer_Science
6 6 Kabir Iqbal 45 Class_B Government Central History
7 7 Isha Rao 36 Class_C Private HeadOffice Biology
8 8 Nikhil Thomas 40 Class_A Public Zonal Economics
9 9 Ananya Das 28 Class_D Government Regional Sociology
10 10 Vinay Kulkarni 33 Class_B Private Central Commerce


STEP 2 : USING ASSINGMENT STATEMENT CREATINE VARIABLES

DATA NEW01;

 RETAIN ID FIRST_NAME  LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE;

 SET NEWDATA;

 FIRST_NAME = SCAN(NAME,1);

 LAST_NAME  = SCAN(NAME,2);

 DROP NAME;

 AGE = AGE+1;

RUN;

PROC PRINT;

RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE
1 1 Priya Sharma 35 Class_A Public Regional Science
2 2 Rohan Verma 42 Class_B Private Central Mathematics
3 3 Meera Pillai 30 Class_C Public HeadOffice Literature
4 4 Arjun Mehta 39 Class_A Private Zonal Physics
5 5 Sneha Reddy 32 Class_D Public Regional Computer_Science
6 6 Kabir Iqbal 46 Class_B Government Central History
7 7 Isha Rao 37 Class_C Private HeadOffice Biology
8 8 Nikhil Thomas 41 Class_A Public Zonal Economics
9 9 Ananya Das 29 Class_D Government Regional Sociology
10 10 Vinay Kulkarni 34 Class_B Private Central Commerce


STEP 3 :CREATING 3 DIFFERENT DATASETS FROM ONE DATASET

DATA PRIVATE 

            PUBLIC 

   GOVT;

 SET NEW01;

 IF SCHOOLTYPE = "Private" THEN OUTPUT PRIVATE;

 ELSE IF SCHOOLTYPE = "Public" THEN OUTPUT PUBLIC;

 ELSE OUTPUT GOVT;

RUN;

PROC PRINT DATA=PRIVATE;

RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE
1 2 Rohan Verma 42 Class_B Private Central Mathematics
2 4 Arjun Mehta 39 Class_A Private Zonal Physics
3 7 Isha Rao 37 Class_C Private HeadOffice Biology
4 10 Vinay Kulkarni 34 Class_B Private Central Commerce

PROC PRINT DATA=PUBLIC;

RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE
1 1 Priya Sharma 35 Class_A Public Regional Science
2 3 Meera Pillai 30 Class_C Public HeadOffice Literature
3 5 Sneha Reddy 32 Class_D Public Regional Computer_Science
4 8 Nikhil Thomas 41 Class_A Public Zonal Economics

PROC PRINT DATA=GOVT;

RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE
1 6 Kabir Iqbal 46 Class_B Government Central History
2 9 Ananya Das 29 Class_D Government Regional Sociology


STEP 4 : ALTERING THE TABLE

PROC SQL;

 ALTER TABLE PRIVATE ADD FEES NUM FORMAT = COMMA12.;

QUIT;

PROC PRINT;RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE FEES
1 2 Rohan Verma 42 Class_B Private Central Mathematics .
2 4 Arjun Mehta 39 Class_A Private Zonal Physics .
3 7 Isha Rao 37 Class_C Private HeadOffice Biology .
4 10 Vinay Kulkarni 34 Class_B Private Central Commerce .

PROC SQL;

 ALTER TABLE PUBLIC ADD EMAIL CHAR (30);

RUN;

PROC PRINT;RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE EMAIL
1 1 Priya Sharma 35 Class_A Public Regional Science  
2 3 Meera Pillai 30 Class_C Public HeadOffice Literature  
3 5 Sneha Reddy 32 Class_D Public Regional Computer_Science  
4 8 Nikhil Thomas 41 Class_A Public Zonal Economics  

PROC SQL;

 ALTER TABLE GOVT ADD STATUS CHAR(15);

QUIT;

PROC PRINT;RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE STATUS
1 6 Kabir Iqbal 46 Class_B Government Central History  
2 9 Ananya Das 29 Class_D Government Regional Sociology  


STEP 5 : UPDATING THE ALTERED TABLE 

PROC SQL;

 UPDATE PRIVATE

 SET FEES = CASE

   WHEN ID = 2 THEN 300000

   WHEN ID = 4 THEN 500000

   WHEN ID = 7 THEN 320000

   WHEN ID = 10 THEN 400000

   ELSE FEES

   END;

QUIT;

PROC PRINT;RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE FEES
1 2 Rohan Verma 42 Class_B Private Central Mathematics 300,000
2 4 Arjun Mehta 39 Class_A Private Zonal Physics 500,000
3 7 Isha Rao 37 Class_C Private HeadOffice Biology 320,000
4 10 Vinay Kulkarni 34 Class_B Private Central Commerce 400,000


PROC SQL;

 UPDATE PUBLIC

  SET EMAIL = CASE

    WHEN ID = 1 THEN "Priya.Sharma@gmail.com"

    WHEN ID = 3 THEN "Meera.Pillai@gmail.com"

WHEN ID = 5 THEN "Sneha.Reddy@gmail.com"

WHEN ID = 8 THEN "Nikhil.THomas@gmail.com"

ELSE EMAIL

END;

QUIT;

PROC PRINT;RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE EMAIL
1 1 Priya Sharma 35 Class_A Public Regional Science Priya.Sharma@gmail.com
2 3 Meera Pillai 30 Class_C Public HeadOffice Literature Meera.Pillai@gmail.com
3 5 Sneha Reddy 32 Class_D Public Regional Computer_Science Sneha.Reddy@gmail.com
4 8 Nikhil Thomas 41 Class_A Public Zonal Economics Nikhil.THomas@gmail.com


PROC SQL;

 UPDATE GOVT 

  SET STATUS = CASE

   WHEN ID = 6 THEN "ACTIVE"

   WHEN ID = 9 THEN  "" 

   ELSE STATUS

   END;

QUIT;

PROC PRINT;RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE STATUS
1 6 Kabir Iqbal 46 Class_B Government Central History ACTIVE
2 9 Ananya Das 29 Class_D Government Regional Sociology  


STEP 6: MERGING THE UPDATES TABLES

DATA JOIN;

 MERGE PRIVATE

                PUBLIC

        GOVT;

BY ID;

RUN;

PROC PRINT;RUN;

OUTPUT:

Obs ID FIRST_NAME LAST_NAME AGE CLASSTYPE SCHOOLTYPE OFFICETYPE DEPTTYPE FEES EMAIL STATUS
1 1 Priya Sharma 35 Class_A Public Regional Science . Priya.Sharma@gmail.com  
2 2 Rohan Verma 42 Class_B Private Central Mathematics 300,000    
3 3 Meera Pillai 30 Class_C Public HeadOffice Literature . Meera.Pillai@gmail.com  
4 4 Arjun Mehta 39 Class_A Private Zonal Physics 500,000    
5 5 Sneha Reddy 32 Class_D Public Regional Computer_Science . Sneha.Reddy@gmail.com  
6 6 Kabir Iqbal 46 Class_B Government Central History .   ACTIVE
7 7 Isha Rao 37 Class_C Private HeadOffice Biology 320,000    
8 8 Nikhil Thomas 41 Class_A Public Zonal Economics . Nikhil.THomas@gmail.com  
9 9 Ananya Das 29 Class_D Government Regional Sociology .    
10 10 Vinay Kulkarni 34 Class_B Private Central Commerce 400,000    


STEP 7: MISSING DATA

PROC MEANS DATA = JOIN NMISS;
RUN;

OUTPUT:

The MEANS Procedure

Variable N Miss
ID
AGE
FEES
0
0
6


PROC FREQ DATA = JOIN ;
 TABLES EMAIL / MISSING;
RUN;

OUTPUT:

The FREQ Procedure

EMAIL Frequency Percent Cumulative
Frequency
Cumulative
Percent
  6 60.00 6 60.00
Meera.Pillai@gmail.com 1 10.00 7 70.00
Nikhil.THomas@gmail.com 1 10.00 8 80.00
Priya.Sharma@gmail.com 1 10.00 9 90.00
Sneha.Reddy@gmail.com 1 10.00 10 100.00


PROC FREQ DATA = JOIN ;
 TABLES STATUS / MISSING;
RUN;

OUTPUT:

The FREQ Procedure

STATUS Frequency Percent Cumulative
Frequency
Cumulative
Percent
  9 90.00 9 90.00
ACTIVE 1 10.00 10 100.00



To Visit My Previous Proc  Means And Nway Option:Click Here
To Visit My Previous Proc Means And CharType Option:Click Here
To Visit My Previous SAS Functions:Click Here
To Visit My Previous Length Statement Using In Many Ways:Click Here





--->PLEASE FOLLOW OUR BLOG FOR MORE INFORMATION.
--->PLEASE DO COMMENTS AND SHARE OUR BLOG.

PLEASE FOLLOW OUR TELEGRAM CHANNEL CLICK HERE

PLEASE FOLLOW OUR FACEBOOK PAGE  CLICK HERE

PLEASE FOLLOW OUR INSTAGRAM PAGE CLICK HERE




Comments

Popular posts from this blog

397.If a satellite has excellent signal strength but very high latency, can it still deliver good quality communication? Why or why not?A Sas Study

401.How Efficient Are Global Data Centers? A Complete SAS Analytics Study

383.Which Pharma Distributors Are High-Risk? Can SAS Detect Delays, Temperature Violations, and Fraud?