178.STEP-BY-STEP SDTM DATASET CREATION GUIDE FOR CLINICAL TRIAL PROGRAMMERS USING SAS | UNIQUE CYBERSECURITY-BASED DEMOGRAPHICS EXAMPLE | PROC DATASETS | PROC PRINT | PROC CONTENTS | PROC EXPORT | SAS MACRO AUTOMATION FOR SDTM COMPLIANCE

STEP-BY-STEP SDTM DATASET CREATION GUIDE FOR CLINICAL TRIAL PROGRAMMERS USING SAS | UNIQUE CYBERSECURITY-BASED DEMOGRAPHICS EXAMPLE | PROC DATASETS | PROC PRINT | PROC CONTENTS | PROC EXPORT | SAS MACRO AUTOMATION FOR SDTM COMPLIANCE

/*Let's walk step-by-step through creating a unique SDTM (Study Data Tabulation Model) dataset*/

Step 1: Understand the SDTM DM Dataset

The Demographics (DM) dataset captures subject-level data such as:

Study ID

Subject ID

Site ID

Age, Sex, Race

Country

Date of Birth

Randomization date (RFSTDTC)

Arm assignment

Unique Subject Identifier (USUBJID)


Step 2: Define Variables

We will use the following variables for the DM dataset:

Variable

Label

Description

STUDYID

Study Identifier

"CYBER001"

DOMAIN

Domain Abbreviation

"DM"

USUBJID

Unique Subject Identifier

"CYBER001-SITEID-SUBJID"

SUBJID

Subject Identifier

Subject number

SITEID

Study Site Identifier

e.g., "101", "102"

SEX

Sex

"M" or "F"

AGE

Age

Integer

AGEU

Age Units

"YEARS"

RACE

Race

Standard categories

COUNTRY

Country

e.g., "INDIA", "USA"

ARM

Description of Planned Arm

e.g., "CyberRelax", "Placebo"

RFSTDTC

Subject Reference Start Date/Time

Randomization date in ISO8601


Step 3: Create the Raw Input Data

Placing 12 unique observations in the Raw Dataset. Let’s create an input dataset in SAS.

data raw_dm;

    infile datalines dlm=',' dsd;

    input SUBJID :$3. SITEID :$3. SEX :$1. AGE RACE :$20. COUNTRY :$10. ARM :$15. RFSTDTC :$10.;

    datalines;

001,101,M,28,Asian,INDIA,CyberRelax,2024-06-01

002,101,F,34,White,USA,Placebo,2024-06-03

003,102,M,41,Black,USA,CyberRelax,2024-06-05

004,102,F,29,Asian,INDIA,CyberRelax,2024-06-06

005,101,M,36,White,USA,Placebo,2024-06-07

006,103,F,26,Other,GERMANY,CyberRelax,2024-06-08

007,103,M,33,Asian,INDIA,Placebo,2024-06-09

008,101,F,45,White,UK,CyberRelax,2024-06-10

009,102,M,38,Black,USA,Placebo,2024-06-11

010,103,F,30,Asian,INDIA,CyberRelax,2024-06-12

011,104,M,27,White,CANADA,Placebo,2024-06-13

012,104,F,31,Other,JAPAN,CyberRelax,2024-06-14

;

run;

proc print;run;

Output:

Obs SUBJID SITEID SEX AGE RACE COUNTRY ARM RFSTDTC
1 001 101 M 28 Asian INDIA CyberRelax 2024-06-01
2 002 101 F 34 White USA Placebo 2024-06-03
3 003 102 M 41 Black USA CyberRelax 2024-06-05
4 004 102 F 29 Asian INDIA CyberRelax 2024-06-06
5 005 101 M 36 White USA Placebo 2024-06-07
6 006 103 F 26 Other GERMANY CyberRelax 2024-06-08
7 007 103 M 33 Asian INDIA Placebo 2024-06-09
8 008 101 F 45 White UK CyberRelax 2024-06-10
9 009 102 M 38 Black USA Placebo 2024-06-11
10 010 103 F 30 Asian INDIA CyberRelax 2024-06-12
11 011 104 M 27 White CANADA Placebo 2024-06-13
12 012 104 F 31 Other JAPAN CyberRelax 2024-06-14


Step 4: Define a SAS Macro to Transform to SDTM Format

Add STUDYID as “CYBER001”

Add DOMAIN as “DM”

Create USUBJID = STUDYID || "-" || SITEID || "-" || SUBJID

Add AGEU as “YEARS”


%macro create_sdtm_dm(input=, output=);

    data &output.;

        set &input.;

        length STUDYID $10 DOMAIN $2 USUBJID $25 AGEU $5;

        STUDYID = "CYBER001";

        DOMAIN  = "DM";

        USUBJID = catx("-", STUDYID, SITEID, SUBJID);

        AGEU    = "YEARS";

        label 

            STUDYID  = "Study Identifier"

            DOMAIN   = "Domain Abbreviation"

            USUBJID  = "Unique Subject Identifier"

            SUBJID   = "Subject Identifier for the Study"

            SITEID   = "Study Site Identifier"

            SEX      = "Sex"

            AGE      = "Age"

            AGEU     = "Age Units"

            RACE     = "Race"

            COUNTRY  = "Country"

            ARM      = "Description of Planned Arm"

            RFSTDTC  = "Subject Reference Start Date/Time";

    run;

    proc print;run;

%mend;

 Step 5: Execute the Macro

 %create_sdtm_dm(input=raw_dm, output=sdtm_dm);

Output:

Obs SUBJID SITEID SEX AGE RACE COUNTRY ARM RFSTDTC STUDYID DOMAIN USUBJID AGEU
1 001 101 M 28 Asian INDIA CyberRelax 2024-06-01 CYBER001 DM CYBER001-101-001 YEARS
2 002 101 F 34 White USA Placebo 2024-06-03 CYBER001 DM CYBER001-101-002 YEARS
3 003 102 M 41 Black USA CyberRelax 2024-06-05 CYBER001 DM CYBER001-102-003 YEARS
4 004 102 F 29 Asian INDIA CyberRelax 2024-06-06 CYBER001 DM CYBER001-102-004 YEARS
5 005 101 M 36 White USA Placebo 2024-06-07 CYBER001 DM CYBER001-101-005 YEARS
6 006 103 F 26 Other GERMANY CyberRelax 2024-06-08 CYBER001 DM CYBER001-103-006 YEARS
7 007 103 M 33 Asian INDIA Placebo 2024-06-09 CYBER001 DM CYBER001-103-007 YEARS
8 008 101 F 45 White UK CyberRelax 2024-06-10 CYBER001 DM CYBER001-101-008 YEARS
9 009 102 M 38 Black USA Placebo 2024-06-11 CYBER001 DM CYBER001-102-009 YEARS
10 010 103 F 30 Asian INDIA CyberRelax 2024-06-12 CYBER001 DM CYBER001-103-010 YEARS
11 011 104 M 27 White CANADA Placebo 2024-06-13 CYBER001 DM CYBER001-104-011 YEARS
12 012 104 F 31 Other JAPAN CyberRelax 2024-06-14 CYBER001 DM CYBER001-104-012 YEARS

After execution, the sdtm_dm dataset will contain transformed and SDTM-compliant data.


Step 6: Review the SDTM Dataset


proc print data=sdtm_dm label noobs;

run;

Output:

Subject
Identifier
for the
Study
Study Site
Identifier
Sex Age Race Country Description of
Planned Arm
Subject Reference
Start Date/Time
Study Identifier Domain
Abbreviation
Unique Subject Identifier Age Units
001 101 M 28 Asian INDIA CyberRelax 2024-06-01 CYBER001 DM CYBER001-101-001 YEARS
002 101 F 34 White USA Placebo 2024-06-03 CYBER001 DM CYBER001-101-002 YEARS
003 102 M 41 Black USA CyberRelax 2024-06-05 CYBER001 DM CYBER001-102-003 YEARS
004 102 F 29 Asian INDIA CyberRelax 2024-06-06 CYBER001 DM CYBER001-102-004 YEARS
005 101 M 36 White USA Placebo 2024-06-07 CYBER001 DM CYBER001-101-005 YEARS
006 103 F 26 Other GERMANY CyberRelax 2024-06-08 CYBER001 DM CYBER001-103-006 YEARS
007 103 M 33 Asian INDIA Placebo 2024-06-09 CYBER001 DM CYBER001-103-007 YEARS
008 101 F 45 White UK CyberRelax 2024-06-10 CYBER001 DM CYBER001-101-008 YEARS
009 102 M 38 Black USA Placebo 2024-06-11 CYBER001 DM CYBER001-102-009 YEARS
010 103 F 30 Asian INDIA CyberRelax 2024-06-12 CYBER001 DM CYBER001-103-010 YEARS
011 104 M 27 White CANADA Placebo 2024-06-13 CYBER001 DM CYBER001-104-011 YEARS
012 104 F 31 Other JAPAN CyberRelax 2024-06-14 CYBER001 DM CYBER001-104-012 YEARS


Step 7: SDTM Compliance Checks

7.1 Variable Order

%macro reorder_dm(dataset=, out=);

    data &out.;

    retain STUDYID DOMAIN USUBJID SUBJID SITEID SEX AGE AGEU RACE   COUNTRY               ARM RFSTDTC;

        set &dataset.;

    run;

   proc print;run;

%mend;


%reorder_dm(dataset=sdtm_dm, out=sdtm_dm_final);

Output:

Obs STUDYID DOMAIN USUBJID SUBJID SITEID SEX AGE AGEU RACE COUNTRY ARM RFSTDTC
1 CYBER001 DM CYBER001-101-001 001 101 M 28 YEARS Asian INDIA CyberRelax 2024-06-01
2 CYBER001 DM CYBER001-101-002 002 101 F 34 YEARS White USA Placebo 2024-06-03
3 CYBER001 DM CYBER001-102-003 003 102 M 41 YEARS Black USA CyberRelax 2024-06-05
4 CYBER001 DM CYBER001-102-004 004 102 F 29 YEARS Asian INDIA CyberRelax 2024-06-06
5 CYBER001 DM CYBER001-101-005 005 101 M 36 YEARS White USA Placebo 2024-06-07
6 CYBER001 DM CYBER001-103-006 006 103 F 26 YEARS Other GERMANY CyberRelax 2024-06-08
7 CYBER001 DM CYBER001-103-007 007 103 M 33 YEARS Asian INDIA Placebo 2024-06-09
8 CYBER001 DM CYBER001-101-008 008 101 F 45 YEARS White UK CyberRelax 2024-06-10
9 CYBER001 DM CYBER001-102-009 009 102 M 38 YEARS Black USA Placebo 2024-06-11
10 CYBER001 DM CYBER001-103-010 010 103 F 30 YEARS Asian INDIA CyberRelax 2024-06-12
11 CYBER001 DM CYBER001-104-011 011 104 M 27 YEARS White CANADA Placebo 2024-06-13
12 CYBER001 DM CYBER001-104-012 012 104 F 31 YEARS Other JAPAN CyberRelax 2024-06-14


7.2 Length Check

proc contents data=sdtm_dm_final;

run;

Output:

                                                               The CONTENTS Procedure

Data Set Name WORK.SDTM_DM_FINAL Observations 12
Member Type DATA Variables 12
Engine V9 Indexes 0
Created 14/09/2015 00:24:22 Observation Length 112
Last Modified 14/09/2015 00:24:22 Deleted Observations 0
Protection   Compressed NO
Data Set Type   Sorted NO
Label      
Data Representation WINDOWS_64    
Encoding wlatin1 Western (Windows)    


Engine/Host Dependent Information
Data Set Page Size 65536
Number of Data Set Pages 1
First Data Page 1
Max Obs per Page 584
Obs in First Data Page 12
Number of Data Set Repairs 0
ExtendObsCounter YES
Filename C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD1524_DESKTOP-QFAA4KV_\sdtm_dm_final.sas7bdat
Release Created 9.0401M2
Host Created X64_8HOME


Alphabetic List of Variables and Attributes
# Variable Type Len Label
7 AGE Num 8 Age
8 AGEU Char 5 Age Units
11 ARM Char 15 Description of Planned Arm
10 COUNTRY Char 10 Country
2 DOMAIN Char 2 Domain Abbreviation
9 RACE Char 20 Race
12 RFSTDTC Char 10 Subject Reference Start Date/Time
6 SEX Char 1 Sex
5 SITEID Char 3 Study Site Identifier
1 STUDYID Char 10 Study Identifier
4 SUBJID Char 3 Subject Identifier for the Study
3 USUBJID Char 25 Unique Subject Identifier


Step 8: Export the Dataset to XPORT Format


XPT is used for FDA submission.


libname outlib xport 'path/sdtm_dm_final.xpt';


data outlib.DM;

    set sdtm_dm_final;

run;


Overall Summary:

1.Defined the goal: create a unique SDTM-compliant DM dataset.

2.Created raw input with realistic, cybersecurity-themed trial data.

3.Wrote a macro to automate transformation to SDTM format.

4.Executed and validated the macro.

5.Reordered variables for CDISC compliance.

6.Exported the dataset to an XPORT format.


PRACTICE AND COMMENT YOUR CODE: 

--->PLEASE FOLLOW OUR BLOG FOR MORE UPDATES.

PLEASE FOLLOW OUR TELEGRAM CHANNEL CLICK HERE

PLEASE FOLLOW OUR FACEBOOK PAGE  CLICK HERE

PLEASE FOLLOW OUR INSTAGRAM PAGE CLICK HERE

Comments