6.SUBSETTING OBSERVATIONS AND VARIABLES

                SUBSETTING/CONTROLLING OBSERVATIONS AND VARIABLE


FOR OBSERVATIONS:::TO SUBSET OBSERVATION THERE ARE STATEMENTS LIKE 

         1.WHERE

  2.IF THEN 

  3.IF THEN ELSE

  4.FIRST OBS AND OBS

  5.FIRST. AND LAST.      ETC


FOR VARIABLES::::TO SUBSET VARIABLES THERE ARE STATEMENTS LIKE

        1.KEEP:INCLUDING THE VARIABLES

2.DROP:EXCLUDING THE VARIABLES


 1.WHERE:

 -->WE CAN USE CHARACTER AND NUMERIC DATA

 -->WHERE DATA WILL BE 'CASE SENSITIVE'(A a)

 -->FOR CHARACTER OBSERVATION NEED '    '  (QUOTES)

 -->FOR NUMERIC NO NEED TO GIVE ANY QUOTES

 --> WHERE STATEMENT WILL WORK IN 'DATA' STEP AND 'PROC'STEP*/;

EG1:

DATA SAS;

SET SASHELP.CLASS;

WHERE SEX='M';

RUN;

PROC PRINT DATA=SAS;

RUN;

--->WHERE STATEMENT IS CASE SENSITIVE SO IF ANY CASE SENSITIVE CODE IN SYNTAX IT WILL SHOW ERROR.

EG2:

DATA SAS;

SET SASHELP.CLASS;

WHERE SEX='m';

RUN;

PROC PRINT DATA=SAS;

RUN;


WHERE OPERATORS::THERE ARE OPERATORS LIKE

1.COMPARISON OPERATOR:

  --EQUAL                                      EQ            =

  --NOT EQUAL                             NE          ^=

  --GREATER THAN                     GT            >

  --LESSER THAN                        LT             <

  --GREATER THAN EQUAL      GE            >=

  --LESSER THEN  EQUAL          LE           <=

  --  IN OPERATOR


2.ARTHEMATIC OPERATOR:


---   *   MULTIPLICATION

---   /   DIVISION

---   +   ADDITION

---   -   SUBSTRACTION


3.LOGICAL OPERATOR::


--    &  AND

--    |  OR

--   ^  NOT


4.SPECIAL WHERE OPERATOR::


--BETWEEN-AND

--IS NULL 

--IS MISSING

--CONTAINS


1.COMPARISON OPERATOR:

EG1:

DATA SAS1111;

SET SASHELP.CLASS;

WHERE sex="F";

RUN;

EG2:

DATA SAS1;

SET SASHELP.CLASS;

WHERE AGE >14;

RUN;

PROC PRINT;

RUN;

EG3:

DATA SAS2;

SET SASHELP.CLASS;

WHERE AGE <14;

RUN;

PROC PRINT;

RUN;

EG4:

DATA SAS3;

SET SASHELP.CLASS;

WHERE AGE <=14;

RUN;

PROC PRINT;

RUN;

EG5:

DATA SALES;

SET PROG1.SALES;

WHERE COUNTRY IN('AU','IN');

RUN;

PROC PRINT;

RUN;

EG6:

DATA SALES;

SET PROG1.SALES;

WHERE SALARY NE .;

RUN;

PROC PRINT;

RUN;


2.ARTHEMATIC OPERATOR:

EG1:

DATA SALE;

SET PROG1.SALES;

WHERE SALARY / 12 *1.10 >=7500;

RUN;

PROC PRINT;

RUN;

EG2:

DATA SALE2;

SET PROG1.SALES;

WHERE (SALARY / 12) *1.10 >=7500; 

RUN;

PROC PRINT;

RUN;


3.LOGICAL OPERATOR::

EG1:

DATA SALE22;

SET PROG1.SALES;

WHERE GENDER NE 'M' AND SALARY >=50000; 

RUN;

PROC PRINT;

RUN;

EG2:

DATA SALE33;

SET PROG1.SALES;

WHERE GENDER NE 'F' OR SALARY >=50000; 

RUN;

PROC PRINT;

RUN;

EG3:

DATA SALE;

SET PROG1.SALES;

WHERE COUNTRY='US' OR COUNTRY='AU'; 

RUN;

PROC PRINT;

RUN;

EG4:

DATA SALE44;

SET PROG1.SALES;

WHERE COUNTRY NOT IN ('US'); 

RUN;

PROC PRINT;

RUN;


4.SPECIAL WHERE OPERATOR::

EG1:

DATA SALE66;

SET PROG1.SALES;

WHERE SALARY BETWEEN 50000 AND 100000;

RUN;

PROC PRINT;

RUN;

EG2:

DATA SALE55;

SET PROG1.SALES;

WHERE SALARY NOT BETWEEN 50000 AND 100000; 

RUN;

PROC PRINT;

RUN;

EG3:

DATA SALE77;

SET PROG1.SALES;

WHERE SALARY IS NULL; 

RUN;

PROC PRINT;

RUN;

EG4:

DATA SALE775;

SET PROG1.SALES;

WHERE GENDER IS MISSING; 

RUN;

EG5:

DATA SALE123;

SET PROG1.SALES;

WHERE JOB_TITLE CONTAINS 'Manager'; 

RUN;

PROC PRINT;

RUN;

EG6:

DATA SALE90;

SET PROG1.SALES;

WHERE JOB_TITLE CONTAINS 'Officer'; 

RUN;

EG7:

DATA SALE901;

SET PROG1.SALES;

WHERE JOB_TITLE CONTAINS 'Rep'; 

RUN;


FOR VARIABLES::::TO SUBSET THE VARIABLES THERE ARE STATEMENTS LIKE

          1.KEEP:INCLUDING THE VARIABLES

  2.DROP:EXCLUDING THE VARIABLES


HERE KEEP AND DROP WILL WORK IN DATA STEP NOT IN 'PROC STEP'


1.KEEP:INCLUDING THE VARIABLES

EG1:

DATA SAS125;

SET SASHELP.CLASS;

KEEP NAME AGE;

RUN;

EG2:

DATA SAS09;

SET SASHELP.CLASS;

drop NAME AGE;

RUN;

PROC PRINT;

RUN;


2.DROP:EXCLUDING THE VARIABLES

EG1:

DATA SAS301;

SET SASHELP.CARS;

DROP MAKE MODEL MSRP;

RUN;

EG2:

DATA SAS322;

SET SASHELP.CARS;

keep MAKE MODEL MSRP;

RUN;

PROC PRINT;

RUN;

EG3:

DATA WW;

SET SASHELP.CLASS;

KEEP AGE SEX;

WHERE AGE > 13;

RUN;

EG4:

DATA EWE;

KEEP       N R U;

INPUT     N R U I O Y P Y;

CARDS;

1 2 3 4 5 6 7 8

9 8 7 6 5 4 3 2

1 2 3 4 5 6 7 8

9 8 7 6 5 4 3 2

1 2 3 4 5 6 7 8

9 8 7 6 5 4 3 2

1 2 3 4 5 6 7 8

9 8 7 6 5 4 3 2

;

RUN;



********  statments are working in DATASTEP AND PROC STEP;


DATA SS;

SET;

INFILE;

INPUT;

KEEP;

DROP;

WHERE;

RUN;

more......


PROC PRINT;

SET;

INFILE;

INPUT;

KEEP;

DROP;

WHERE;

VAR;

RUN;


PROC PRINT DATA=SASHELP.CLASS;

VAR AGE SEX;

WHERE AGE > 15;

RUN;




PROC PRINT DATA=SASHELP.HEART;

VAR SYSTOLIC DIASTOLIC;

WHERE DIASTOLIC > 89;

RUN;


Comments