From Raw Trial Records to ADaM Intelligence: Building High-Impact ADSL and ADAE Datasets for Clinical Analytics
Transforming DM, EX, and AE Domains into Submission-Ready ADaM Datasets
Introduction
In clinical trials, raw datasets collected from different domains are
usually inconsistent, incomplete, and not directly suitable for statistical
analysis. To perform accurate safety and efficacy analysis, the raw data must
be transformed into standardized analysis-ready datasets following CDISC ADaM
guidelines. In this project, the DM (Demographics), EX (Exposure), and AE
(Adverse Events) datasets were used to create two important ADaM datasets known
as ADSL and ADAE.
The ADSL dataset acts as the backbone of all analysis datasets because it
contains one record per subject with important treatment and population-level
information. Variables such as treatment start date, treatment end date, safety
population flag, intent-to-treat flag, and death flag were derived using
exposure and adverse event information. The ADAE dataset was developed to
analyze adverse events occurring during the clinical trial. Important variables
such as severity flags, serious adverse event flags, treatment emergent flags,
and relationship flags were derived after cleaning and standardizing the AE
domain.
This project demonstrates real-world clinical SAS programming concepts
including data cleaning, date conversions, duplicate removal, merging datasets,
derivation logic, standardization techniques, and analysis-ready dataset
creation. It also reflects the practical workflow followed by SAS programmers
in pharmaceutical and CRO industries during clinical trial submissions and
regulatory reporting.
/*DM→ Demographics — Understanding DM Dataset*/
DATA dm;
LENGTH STUDYID $10 USUBJID $25 SEX $10 RACE $20
ARM $20 RFSTDTC $15 RFENDTC $15;
INFILE DATALINES DLM='|' DSD TRUNCOVER;
INPUT STUDYID :$10. USUBJID :$25. SITEID SEX :$10.
RACE :$20. ARM :$20. RFSTDTC :$15. RFENDTC :$15. AGE;
DATALINES;
ABC123|ABC123-101|1001|Male|Asian|Drug A|01JAN2025|31JAN2025|45
ABC123|ABC123-102|1001|FEMALE|White|PLACEBO|02/15/2025||52
ABC123|ABC123-103|1002|m|Black|drug b|01MAR2025|15MAR2025|38
ABC123|ABC123-104|1002|f|Asian|Drug A|04-01-2025||.
ABC123|ABC123-105|1003|MALE|White|Placebo|2025-05-01|2025-05-20|60
ABC123|ABC123-106|1003|Female|Asian|DRUG B|06/10/2025|06/25/2025|35
ABC123|ABC123-107|1004|M|Black|Drug A|07-01-2025||48
ABC123|ABC123-108|1004|F|Asian|placebo|2025-08-20|2025-08-30|29
ABC123|ABC123-109|1005|unknown|White|Drug B|09SEP2025|15SEP2025|55
ABC123|ABC123-110|1005|male|Asian|DRUG A|2025/10/01||42
ABC123|ABC123-111|1006| FEMALE |Black|Placebo|11NOV2025|15NOV2025|50
ABC123|ABC123-112|1006|m|White|Drug B|2025-12-01|2025-12-10|39
ABC123|ABC123-113|1007|Female|Asian|drug a|01/03/2025||33
ABC123|ABC123-114|1007|M|Black|PLACEBO|14FEB2025|18FEB2025|.
ABC123|ABC123-115|1008|f|White|Drug B|2025-03-15|2025-03-20|41
ABC123|ABC123-116|1008|Male|Asian|Drug A|04APR2025||58
ABC123|ABC123-117|1009|FEMALE|Black|Placebo|2025-05-20|2025-05-25|47
ABC123|ABC123-118|1009|m|Asian|drug b|06/06/2025||36
ABC123|ABC123-119|1010|Female|White|Drug A|07JUL2025|15JUL2025|44
ABC123|ABC123-120|1010|Male|Asian|PLACEBO|08AUG2025||49
ABC123|ABC123-101|1001|Male|Asian|Drug A|01JAN2025|31JAN2025|45
;
RUN;
PROC PRINT DATA=dm;
RUN;
OUTPUT:
| Obs | STUDYID | USUBJID | SEX | RACE | ARM | RFSTDTC | RFENDTC | SITEID | AGE |
|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123 | ABC123-101 | Male | Asian | Drug A | 01JAN2025 | 31JAN2025 | 1001 | 45 |
| 2 | ABC123 | ABC123-102 | FEMALE | White | PLACEBO | 02/15/2025 | 1001 | 52 | |
| 3 | ABC123 | ABC123-103 | m | Black | drug b | 01MAR2025 | 15MAR2025 | 1002 | 38 |
| 4 | ABC123 | ABC123-104 | f | Asian | Drug A | 04-01-2025 | 1002 | . | |
| 5 | ABC123 | ABC123-105 | MALE | White | Placebo | 2025-05-01 | 2025-05-20 | 1003 | 60 |
| 6 | ABC123 | ABC123-106 | Female | Asian | DRUG B | 06/10/2025 | 06/25/2025 | 1003 | 35 |
| 7 | ABC123 | ABC123-107 | M | Black | Drug A | 07-01-2025 | 1004 | 48 | |
| 8 | ABC123 | ABC123-108 | F | Asian | placebo | 2025-08-20 | 2025-08-30 | 1004 | 29 |
| 9 | ABC123 | ABC123-109 | unknown | White | Drug B | 09SEP2025 | 15SEP2025 | 1005 | 55 |
| 10 | ABC123 | ABC123-110 | male | Asian | DRUG A | 2025/10/01 | 1005 | 42 | |
| 11 | ABC123 | ABC123-111 | FEMALE | Black | Placebo | 11NOV2025 | 15NOV2025 | 1006 | 50 |
| 12 | ABC123 | ABC123-112 | m | White | Drug B | 2025-12-01 | 2025-12-10 | 1006 | 39 |
| 13 | ABC123 | ABC123-113 | Female | Asian | drug a | 01/03/2025 | 1007 | 33 | |
| 14 | ABC123 | ABC123-114 | M | Black | PLACEBO | 14FEB2025 | 18FEB2025 | 1007 | . |
| 15 | ABC123 | ABC123-115 | f | White | Drug B | 2025-03-15 | 2025-03-20 | 1008 | 41 |
| 16 | ABC123 | ABC123-116 | Male | Asian | Drug A | 04APR2025 | 1008 | 58 | |
| 17 | ABC123 | ABC123-117 | FEMALE | Black | Placebo | 2025-05-20 | 2025-05-25 | 1009 | 47 |
| 18 | ABC123 | ABC123-118 | m | Asian | drug b | 06/06/2025 | 1009 | 36 | |
| 19 | ABC123 | ABC123-119 | Female | White | Drug A | 07JUL2025 | 15JUL2025 | 1010 | 44 |
| 20 | ABC123 | ABC123-120 | Male | Asian | PLACEBO | 08AUG2025 | 1010 | 49 | |
| 21 | ABC123 | ABC123-101 | Male | Asian | Drug A | 01JAN2025 | 31JAN2025 | 1001 | 45 |
Explanation
This
starts creation of the DM dataset.
DM stands
for Demographics.
It
contains:
- Subject information
- Sex
- Race
- Treatment Arm
- Study start/end dates
- Age
DM is
considered the parent dataset because almost every analysis dataset
links back to DM.
The
LENGTH statement defines:
- Variable type
- Storage size
$ means
character variable.
Example:
- STUDYID $10 → Character
variable with maximum length 10.
Why
important?
- Prevents truncation
- Controls memory
- Maintains consistency
This
tells SAS how to read raw data.
|
Option |
Purpose |
|
DLM=' |
' |
|
DSD |
Handles
missing values properly |
|
TRUNCOVER |
Prevents
SAS from reading next line when values missing |
Very
common in raw clinical datasets.
Defines
how variables are read.
:
modifier allows flexible reading.
Example:
SEX :$10.
means:
- Read character
- Maximum width 10
About Important Variables
|
Variable |
Meaning |
|
STUDYID |
Clinical
study identifier |
|
USUBJID |
Unique
subject ID |
|
SITEID |
Site
number |
|
SEX |
Subject
gender |
|
RACE |
Race
category |
|
ARM |
Treatment
arm |
|
RFSTDTC |
Reference
Start Date |
|
RFENDTC |
Reference
End Date |
|
AGE |
Subject
age |
Why Data Contains Issues
You
intentionally included:
- Duplicate records
- Missing values
- Different date formats
- Inconsistent capitalization
This
mimics real-world clinical data.
Examples:
|
Problem |
Example |
|
Duplicate
subject |
ABC123-101 |
|
Missing
age |
. |
|
Different
sex formats |
Male,
MALE, m |
|
Different
date formats |
01JAN2025,
2025-01-01, 02/15/2025 |
This is
excellent interview-level practice.
Displays
dataset contents.
Used for:
- QC checks
- Validation
- Debugging
/*EX→ Exposure — Understanding EX Dataset*/
DATA ex;
LENGTH USUBJID $25 EXSTDTC $15 EXENDTC $15;
INFILE DATALINES DLM='|' DSD TRUNCOVER;
INPUT USUBJID :$25. EXSTDTC :$15. EXENDTC :$15. EXDOSE;
DATALINES;
ABC123-101|01JAN2025|05JAN2025|100
ABC123-101|2025-01-10|2025-01-15|100
ABC123-102|02/15/2025||200
ABC123-102|2025-02-20|2025-02-25|200
ABC123-103|01MAR2025|05MAR2025|150
ABC123-103|2025/03/10|2025/03/15|150
ABC123-104|04-01-2025||120
ABC123-104|15APR2025|20APR2025|120
ABC123-105|2025-05-01|2025-05-10|100
ABC123-105|2025-05-15||100
ABC123-106|06/10/2025|06/15/2025|80
ABC123-106|06/20/2025|06/25/2025|80
ABC123-107|07-01-2025||90
ABC123-108|2025-08-20|2025-08-30|110
ABC123-109|09SEP2025|15SEP2025|130
ABC123-110|2025/10/01||140
ABC123-111|11NOV2025|15NOV2025|150
ABC123-112|2025-12-01|2025-12-10|160
ABC123-113|01/03/2025||170
ABC123-114|14FEB2025|18FEB2025|180
ABC123-115|2025-03-15|2025-03-20|190
ABC123-101|01JAN2025|05JAN2025|100
;
RUN;
PROC PRINT DATA=ex;
RUN;
OUTPUT:
| Obs | USUBJID | EXSTDTC | EXENDTC | EXDOSE |
|---|---|---|---|---|
| 1 | ABC123-101 | 01JAN2025 | 05JAN2025 | 100 |
| 2 | ABC123-101 | 2025-01-10 | 2025-01-15 | 100 |
| 3 | ABC123-102 | 02/15/2025 | 200 | |
| 4 | ABC123-102 | 2025-02-20 | 2025-02-25 | 200 |
| 5 | ABC123-103 | 01MAR2025 | 05MAR2025 | 150 |
| 6 | ABC123-103 | 2025/03/10 | 2025/03/15 | 150 |
| 7 | ABC123-104 | 04-01-2025 | 120 | |
| 8 | ABC123-104 | 15APR2025 | 20APR2025 | 120 |
| 9 | ABC123-105 | 2025-05-01 | 2025-05-10 | 100 |
| 10 | ABC123-105 | 2025-05-15 | 100 | |
| 11 | ABC123-106 | 06/10/2025 | 06/15/2025 | 80 |
| 12 | ABC123-106 | 06/20/2025 | 06/25/2025 | 80 |
| 13 | ABC123-107 | 07-01-2025 | 90 | |
| 14 | ABC123-108 | 2025-08-20 | 2025-08-30 | 110 |
| 15 | ABC123-109 | 09SEP2025 | 15SEP2025 | 130 |
| 16 | ABC123-110 | 2025/10/01 | 140 | |
| 17 | ABC123-111 | 11NOV2025 | 15NOV2025 | 150 |
| 18 | ABC123-112 | 2025-12-01 | 2025-12-10 | 160 |
| 19 | ABC123-113 | 01/03/2025 | 170 | |
| 20 | ABC123-114 | 14FEB2025 | 18FEB2025 | 180 |
| 21 | ABC123-115 | 2025-03-15 | 2025-03-20 | 190 |
| 22 | ABC123-101 | 01JAN2025 | 05JAN2025 | 100 |
Explanation
Creates
Exposure dataset.
EX domain
stores:
- Drug administration
information
- Dose dates
- Dose amount
Important Variables
|
Variable |
Meaning |
|
EXSTDTC |
Exposure
Start Date |
|
EXENDTC |
Exposure
End Date |
|
EXDOSE |
Dose
Amount |
Why EX is Important
EX is
used to derive:
|
Derived
Variable |
Meaning |
|
TRTSDT |
First
treatment date |
|
TRTEDT |
Last
treatment date |
These are
critical in:
- Safety analysis
- Efficacy analysis
- Treatment emergent
calculations
/*AE→ Adverse Events — Understanding AE Dataset*/
DATA ae;
LENGTH USUBJID $25 AETERM $30 AESTDTC $15 AEENDTC $15
AESEV $15 AESER $10 AEREL $20;
INFILE DATALINES DLM='|' DSD TRUNCOVER;
INPUT USUBJID :$25. AETERM :$30. AESTDTC :$15. AEENDTC :$15.
AESEV :$15. AESER :$10. AEREL :$20.;
DATALINES;
ABC123-101|Headache|01JAN2025|05JAN2025|Mild|Y|Related
ABC123-101|Fever|2025-01-20|2025-01-25|SEVERE|YES|RELATED
ABC123-102|Vomiting|02/10/2025||Moderate|N|Not Related
ABC123-102|Cold|2025-02-28|2025-03-02| mild |NO|Related
ABC123-103|Headache|01MAR2025|05MAR2025|Severe|Y|RELATED
ABC123-103|Fever|2025/03/20||MODERATE|N|Not Related
ABC123-104|Vomiting|03-25-2025|04-01-2025|Mild|No|Related
ABC123-104|Cold|15APR2025|| severe |YES|RELATED
ABC123-105|Headache|2025-04-20|2025-04-22|Moderate|N|Not Related
ABC123-105|Fever|2025-05-18||SEVERE|Y|Related
ABC123-106|Vomiting|06/05/2025|06/08/2025| mild |NO|RELATED
ABC123-106|Cold|06/25/2025||Moderate|YES|Not Related
ABC123-107|Headache|06-25-2025|07-02-2025|SEVERE|Y|Related
ABC123-108|Fever|2025-08-15||Mild|No|RELATED
ABC123-109|Vomiting|09SEP2025|12SEP2025| severe |YES|Related
ABC123-110|Cold|2025/09/25||Moderate|N|Not Related
ABC123-111|Headache|11NOV2025|15NOV2025|MILD|NO|RELATED
ABC123-112|Fever|2025-12-05||Severe|Y|Related
ABC123-113|Vomiting|01/02/2025|01/05/2025|Moderate|YES|Not Related
ABC123-114|Cold|10FEB2025|| mild |N|RELATED
ABC123-115|Headache|2025-03-10|2025-03-15|SEVERE|YES|Related
ABC123-101|Headache|01JAN2025|05JAN2025|Mild|Y|Related
;
RUN;
PROC PRINT DATA=ae;
RUN;
OUTPUT:
| Obs | USUBJID | AETERM | AESTDTC | AEENDTC | AESEV | AESER | AEREL |
|---|---|---|---|---|---|---|---|
| 1 | ABC123-101 | Headache | 01JAN2025 | 05JAN2025 | Mild | Y | Related |
| 2 | ABC123-101 | Fever | 2025-01-20 | 2025-01-25 | SEVERE | YES | RELATED |
| 3 | ABC123-102 | Vomiting | 02/10/2025 | Moderate | N | Not Related | |
| 4 | ABC123-102 | Cold | 2025-02-28 | 2025-03-02 | mild | NO | Related |
| 5 | ABC123-103 | Headache | 01MAR2025 | 05MAR2025 | Severe | Y | RELATED |
| 6 | ABC123-103 | Fever | 2025/03/20 | MODERATE | N | Not Related | |
| 7 | ABC123-104 | Vomiting | 03-25-2025 | 04-01-2025 | Mild | No | Related |
| 8 | ABC123-104 | Cold | 15APR2025 | severe | YES | RELATED | |
| 9 | ABC123-105 | Headache | 2025-04-20 | 2025-04-22 | Moderate | N | Not Related |
| 10 | ABC123-105 | Fever | 2025-05-18 | SEVERE | Y | Related | |
| 11 | ABC123-106 | Vomiting | 06/05/2025 | 06/08/2025 | mild | NO | RELATED |
| 12 | ABC123-106 | Cold | 06/25/2025 | Moderate | YES | Not Related | |
| 13 | ABC123-107 | Headache | 06-25-2025 | 07-02-2025 | SEVERE | Y | Related |
| 14 | ABC123-108 | Fever | 2025-08-15 | Mild | No | RELATED | |
| 15 | ABC123-109 | Vomiting | 09SEP2025 | 12SEP2025 | severe | YES | Related |
| 16 | ABC123-110 | Cold | 2025/09/25 | Moderate | N | Not Related | |
| 17 | ABC123-111 | Headache | 11NOV2025 | 15NOV2025 | MILD | NO | RELATED |
| 18 | ABC123-112 | Fever | 2025-12-05 | Severe | Y | Related | |
| 19 | ABC123-113 | Vomiting | 01/02/2025 | 01/05/2025 | Moderate | YES | Not Related |
| 20 | ABC123-114 | Cold | 10FEB2025 | mild | N | RELATED | |
| 21 | ABC123-115 | Headache | 2025-03-10 | 2025-03-15 | SEVERE | YES | Related |
| 22 | ABC123-101 | Headache | 01JAN2025 | 05JAN2025 | Mild | Y | Related |
Explanation
Creates
Adverse Events dataset.
AE domain
stores:
- Medical events
- Severity
- Seriousness
- Relationship to drug
Important Variables
|
Variable |
Meaning |
|
AETERM |
Adverse
Event Term |
|
AESTDTC |
AE
Start Date |
|
AEENDTC |
AE End
Date |
|
AESEV |
Severity |
|
AESER |
Serious
Event Flag |
|
AEREL |
Relationship
to treatment |
Why AE is Critical
AE data
is the most important safety domain.
Used in:
- Regulatory submissions
- Safety review
- SAE reporting
- CSR tables
Creating ADSL Dataset
ADSL =
Subject-Level Analysis Dataset
ADSL is
the backbone of all ADaM datasets.
/*Step 1: Prepare Exposure Dates*/
/*Derive first dose date (TRTSDT) and last dose date (TRTEDT) from EX.*/
PROC SORT DATA=EX;
BY USUBJID;
RUN;
PROC PRINT DATA=EX;
RUN;
OUTPUT:
| Obs | USUBJID | EXSTDTC | EXENDTC | EXDOSE |
|---|---|---|---|---|
| 1 | ABC123-101 | 01JAN2025 | 05JAN2025 | 100 |
| 2 | ABC123-101 | 2025-01-10 | 2025-01-15 | 100 |
| 3 | ABC123-101 | 01JAN2025 | 05JAN2025 | 100 |
| 4 | ABC123-102 | 02/15/2025 | 200 | |
| 5 | ABC123-102 | 2025-02-20 | 2025-02-25 | 200 |
| 6 | ABC123-103 | 01MAR2025 | 05MAR2025 | 150 |
| 7 | ABC123-103 | 2025/03/10 | 2025/03/15 | 150 |
| 8 | ABC123-104 | 04-01-2025 | 120 | |
| 9 | ABC123-104 | 15APR2025 | 20APR2025 | 120 |
| 10 | ABC123-105 | 2025-05-01 | 2025-05-10 | 100 |
| 11 | ABC123-105 | 2025-05-15 | 100 | |
| 12 | ABC123-106 | 06/10/2025 | 06/15/2025 | 80 |
| 13 | ABC123-106 | 06/20/2025 | 06/25/2025 | 80 |
| 14 | ABC123-107 | 07-01-2025 | 90 | |
| 15 | ABC123-108 | 2025-08-20 | 2025-08-30 | 110 |
| 16 | ABC123-109 | 09SEP2025 | 15SEP2025 | 130 |
| 17 | ABC123-110 | 2025/10/01 | 140 | |
| 18 | ABC123-111 | 11NOV2025 | 15NOV2025 | 150 |
| 19 | ABC123-112 | 2025-12-01 | 2025-12-10 | 160 |
| 20 | ABC123-113 | 01/03/2025 | 170 | |
| 21 | ABC123-114 | 14FEB2025 | 18FEB2025 | 180 |
| 22 | ABC123-115 | 2025-03-15 | 2025-03-20 | 190 |
Why?
Sorting
is mandatory before:
- BY-group processing
- MERGE
- FIRST./LAST. logic
Without
sorting:
- SAS gives errors
- Merge becomes incorrect
/*Step 2: Convert Character
Dates to SAS Dates*/
DATA EX_DATES;
SET EX;
EXSTDT = INPUT(EXSTDTC,ANYDTDTE15.);
EXENDT = INPUT(EXENDTC,ANYDTDTE15.);
FORMAT EXSTDT EXENDT DATE9.;
RUN;
PROC PRINT DATA=EX_DATES;
RUN;
OUTPUT:
| Obs | USUBJID | EXSTDTC | EXENDTC | EXDOSE | EXSTDT | EXENDT |
|---|---|---|---|---|---|---|
| 1 | ABC123-101 | 01JAN2025 | 05JAN2025 | 100 | 01JAN2025 | 05JAN2025 |
| 2 | ABC123-101 | 2025-01-10 | 2025-01-15 | 100 | 10JAN2025 | 15JAN2025 |
| 3 | ABC123-101 | 01JAN2025 | 05JAN2025 | 100 | 01JAN2025 | 05JAN2025 |
| 4 | ABC123-102 | 02/15/2025 | 200 | 15FEB2025 | . | |
| 5 | ABC123-102 | 2025-02-20 | 2025-02-25 | 200 | 20FEB2025 | 25FEB2025 |
| 6 | ABC123-103 | 01MAR2025 | 05MAR2025 | 150 | 01MAR2025 | 05MAR2025 |
| 7 | ABC123-103 | 2025/03/10 | 2025/03/15 | 150 | 10MAR2025 | 15MAR2025 |
| 8 | ABC123-104 | 04-01-2025 | 120 | 01APR2025 | . | |
| 9 | ABC123-104 | 15APR2025 | 20APR2025 | 120 | 15APR2025 | 20APR2025 |
| 10 | ABC123-105 | 2025-05-01 | 2025-05-10 | 100 | 01MAY2025 | 10MAY2025 |
| 11 | ABC123-105 | 2025-05-15 | 100 | 15MAY2025 | . | |
| 12 | ABC123-106 | 06/10/2025 | 06/15/2025 | 80 | 10JUN2025 | 15JUN2025 |
| 13 | ABC123-106 | 06/20/2025 | 06/25/2025 | 80 | 20JUN2025 | 25JUN2025 |
| 14 | ABC123-107 | 07-01-2025 | 90 | 01JUL2025 | . | |
| 15 | ABC123-108 | 2025-08-20 | 2025-08-30 | 110 | 20AUG2025 | 30AUG2025 |
| 16 | ABC123-109 | 09SEP2025 | 15SEP2025 | 130 | 09SEP2025 | 15SEP2025 |
| 17 | ABC123-110 | 2025/10/01 | 140 | 01OCT2025 | . | |
| 18 | ABC123-111 | 11NOV2025 | 15NOV2025 | 150 | 11NOV2025 | 15NOV2025 |
| 19 | ABC123-112 | 2025-12-01 | 2025-12-10 | 160 | 01DEC2025 | 10DEC2025 |
| 20 | ABC123-113 | 01/03/2025 | 170 | 03JAN2025 | . | |
| 21 | ABC123-114 | 14FEB2025 | 18FEB2025 | 180 | 14FEB2025 | 18FEB2025 |
| 22 | ABC123-115 | 2025-03-15 | 2025-03-20 | 190 | 15MAR2025 | 20MAR2025 |
Explanation
Raw dates
are character values.
Need
conversion into numeric SAS dates.
INPUT Function
INPUT(variable, informat.)
Converts:
- Character → Numeric
ANYDTDTE Informat
ANYDTDTE15.
Very
powerful informat.
Reads
multiple formats:
|
Raw
Format |
Reads
Successfully |
|
01JAN2025 |
Yes |
|
2025-01-10 |
Yes |
|
02/15/2025 |
Yes |
|
2025/03/10 |
Yes |
This is
extremely common in clinical programming.
FORMAT Statement
FORMAT EXSTDT DATE9.;
Displays
date like:
01JAN2025
instead
of numeric values like:
23741
/*Step 3:Derive Treatment
Dates*/
PROC SQL;
CREATE TABLE EX_SUMMARY AS
SELECT USUBJID,
MIN(EXSTDT) AS TRTSDT FORMAT=DATE9.,
MAX(EXENDT) AS TRTEDT FORMAT=DATE9.
FROM EX_DATES
GROUP BY USUBJID;
QUIT;
PROC PRINT DATA=EX_SUMMARY;
RUN;
OUTPUT:
| Obs | USUBJID | TRTSDT | TRTEDT |
|---|---|---|---|
| 1 | ABC123-101 | 01JAN2025 | 15JAN2025 |
| 2 | ABC123-102 | 15FEB2025 | 25FEB2025 |
| 3 | ABC123-103 | 01MAR2025 | 15MAR2025 |
| 4 | ABC123-104 | 01APR2025 | 20APR2025 |
| 5 | ABC123-105 | 01MAY2025 | 10MAY2025 |
| 6 | ABC123-106 | 10JUN2025 | 25JUN2025 |
| 7 | ABC123-107 | 01JUL2025 | . |
| 8 | ABC123-108 | 20AUG2025 | 30AUG2025 |
| 9 | ABC123-109 | 09SEP2025 | 15SEP2025 |
| 10 | ABC123-110 | 01OCT2025 | . |
| 11 | ABC123-111 | 11NOV2025 | 15NOV2025 |
| 12 | ABC123-112 | 01DEC2025 | 10DEC2025 |
| 13 | ABC123-113 | 03JAN2025 | . |
| 14 | ABC123-114 | 14FEB2025 | 18FEB2025 |
| 15 | ABC123-115 | 15MAR2025 | 20MAR2025 |
Why MIN and MAX?
|
Variable |
Logic |
|
TRTSDT |
First
dose date |
|
TRTEDT |
Last
dose date |
GROUP BY
GROUP BY USUBJID
Performs
subject-level summarization.
Without
GROUP BY:
- Whole dataset summarized
together
Why PROC SQL?
PROC SQL
is powerful for:
- Aggregations
- Joins
- Summaries
Widely
used in industry.
/*Step 4: Create Death Flag*/
/*Using AE dataset.*/
DATA AE_CLEAN;
SET AE;
AEOUT = UPCASE(STRIP(AETERM));
IF AEOUT IN ("DEATH","FATAL") THEN DTHFL = "Y";
ELSE DTHFL = "N";
RUN;
PROC PRINT DATA=AE_CLEAN;
RUN;
OUTPUT:
| Obs | USUBJID | AETERM | AESTDTC | AEENDTC | AESEV | AESER | AEREL | AEOUT | DTHFL |
|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123-101 | Headache | 01JAN2025 | 05JAN2025 | Mild | Y | Related | HEADACHE | N |
| 2 | ABC123-101 | Fever | 2025-01-20 | 2025-01-25 | SEVERE | YES | RELATED | FEVER | N |
| 3 | ABC123-102 | Vomiting | 02/10/2025 | Moderate | N | Not Related | VOMITING | N | |
| 4 | ABC123-102 | Cold | 2025-02-28 | 2025-03-02 | mild | NO | Related | COLD | N |
| 5 | ABC123-103 | Headache | 01MAR2025 | 05MAR2025 | Severe | Y | RELATED | HEADACHE | N |
| 6 | ABC123-103 | Fever | 2025/03/20 | MODERATE | N | Not Related | FEVER | N | |
| 7 | ABC123-104 | Vomiting | 03-25-2025 | 04-01-2025 | Mild | No | Related | VOMITING | N |
| 8 | ABC123-104 | Cold | 15APR2025 | severe | YES | RELATED | COLD | N | |
| 9 | ABC123-105 | Headache | 2025-04-20 | 2025-04-22 | Moderate | N | Not Related | HEADACHE | N |
| 10 | ABC123-105 | Fever | 2025-05-18 | SEVERE | Y | Related | FEVER | N | |
| 11 | ABC123-106 | Vomiting | 06/05/2025 | 06/08/2025 | mild | NO | RELATED | VOMITING | N |
| 12 | ABC123-106 | Cold | 06/25/2025 | Moderate | YES | Not Related | COLD | N | |
| 13 | ABC123-107 | Headache | 06-25-2025 | 07-02-2025 | SEVERE | Y | Related | HEADACHE | N |
| 14 | ABC123-108 | Fever | 2025-08-15 | Mild | No | RELATED | FEVER | N | |
| 15 | ABC123-109 | Vomiting | 09SEP2025 | 12SEP2025 | severe | YES | Related | VOMITING | N |
| 16 | ABC123-110 | Cold | 2025/09/25 | Moderate | N | Not Related | COLD | N | |
| 17 | ABC123-111 | Headache | 11NOV2025 | 15NOV2025 | MILD | NO | RELATED | HEADACHE | N |
| 18 | ABC123-112 | Fever | 2025-12-05 | Severe | Y | Related | FEVER | N | |
| 19 | ABC123-113 | Vomiting | 01/02/2025 | 01/05/2025 | Moderate | YES | Not Related | VOMITING | N |
| 20 | ABC123-114 | Cold | 10FEB2025 | mild | N | RELATED | COLD | N | |
| 21 | ABC123-115 | Headache | 2025-03-10 | 2025-03-15 | SEVERE | YES | Related | HEADACHE | N |
| 22 | ABC123-101 | Headache | 01JAN2025 | 05JAN2025 | Mild | Y | Related | HEADACHE | N |
Explanation
STRIP Function
Removes
leading/trailing blanks.
Example:
" FEMALE "
becomes
"FEMALE"
UPCASE Function
Converts
to uppercase.
Important
for standardization.
Why DTHFL?
DTHFL =
Death Flag
Indicates:
- Subject died during study
Regulatory
importance:
- Safety analysis
- Survival analysis
/*Step 5:Subject-Level
Death Flag*/
PROC SQL;
CREATE TABLE DEATH_FLAG AS
SELECT USUBJID,MAX(DTHFL) AS DTHFL
FROM AE_CLEAN
GROUP BY USUBJID;
QUIT;
PROC PRINT DATA=DEATH_FLAG;
RUN;
OUTPUT:
| Obs | USUBJID | DTHFL |
|---|---|---|
| 1 | ABC123-101 | N |
| 2 | ABC123-102 | N |
| 3 | ABC123-103 | N |
| 4 | ABC123-104 | N |
| 5 | ABC123-105 | N |
| 6 | ABC123-106 | N |
| 7 | ABC123-107 | N |
| 8 | ABC123-108 | N |
| 9 | ABC123-109 | N |
| 10 | ABC123-110 | N |
| 11 | ABC123-111 | N |
| 12 | ABC123-112 | N |
| 13 | ABC123-113 | N |
| 14 | ABC123-114 | N |
| 15 | ABC123-115 | N |
Why MAX(DTHFL)?
Character
comparison:
- "Y" >
"N"
If any
record has Y:
- Subject gets Y
/*Step 6: Remove Duplicate Subjects from DM*/
PROC SORT DATA=DM NODUPKEY OUT=DM_CLEAN;
BY USUBJID;
RUN;
PROC PRINT DATA=DM_CLEAN;
RUN;
LOG:
OUTPUT:
| Obs | STUDYID | USUBJID | SEX | RACE | ARM | RFSTDTC | RFENDTC | SITEID | AGE |
|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123 | ABC123-101 | Male | Asian | Drug A | 01JAN2025 | 31JAN2025 | 1001 | 45 |
| 2 | ABC123 | ABC123-102 | FEMALE | White | PLACEBO | 02/15/2025 | 1001 | 52 | |
| 3 | ABC123 | ABC123-103 | m | Black | drug b | 01MAR2025 | 15MAR2025 | 1002 | 38 |
| 4 | ABC123 | ABC123-104 | f | Asian | Drug A | 04-01-2025 | 1002 | . | |
| 5 | ABC123 | ABC123-105 | MALE | White | Placebo | 2025-05-01 | 2025-05-20 | 1003 | 60 |
| 6 | ABC123 | ABC123-106 | Female | Asian | DRUG B | 06/10/2025 | 06/25/2025 | 1003 | 35 |
| 7 | ABC123 | ABC123-107 | M | Black | Drug A | 07-01-2025 | 1004 | 48 | |
| 8 | ABC123 | ABC123-108 | F | Asian | placebo | 2025-08-20 | 2025-08-30 | 1004 | 29 |
| 9 | ABC123 | ABC123-109 | unknown | White | Drug B | 09SEP2025 | 15SEP2025 | 1005 | 55 |
| 10 | ABC123 | ABC123-110 | male | Asian | DRUG A | 2025/10/01 | 1005 | 42 | |
| 11 | ABC123 | ABC123-111 | FEMALE | Black | Placebo | 11NOV2025 | 15NOV2025 | 1006 | 50 |
| 12 | ABC123 | ABC123-112 | m | White | Drug B | 2025-12-01 | 2025-12-10 | 1006 | 39 |
| 13 | ABC123 | ABC123-113 | Female | Asian | drug a | 01/03/2025 | 1007 | 33 | |
| 14 | ABC123 | ABC123-114 | M | Black | PLACEBO | 14FEB2025 | 18FEB2025 | 1007 | . |
| 15 | ABC123 | ABC123-115 | f | White | Drug B | 2025-03-15 | 2025-03-20 | 1008 | 41 |
| 16 | ABC123 | ABC123-116 | Male | Asian | Drug A | 04APR2025 | 1008 | 58 | |
| 17 | ABC123 | ABC123-117 | FEMALE | Black | Placebo | 2025-05-20 | 2025-05-25 | 1009 | 47 |
| 18 | ABC123 | ABC123-118 | m | Asian | drug b | 06/06/2025 | 1009 | 36 | |
| 19 | ABC123 | ABC123-119 | Female | White | Drug A | 07JUL2025 | 15JUL2025 | 1010 | 44 |
| 20 | ABC123 | ABC123-120 | Male | Asian | PLACEBO | 08AUG2025 | 1010 | 49 |
Why NODUPKEY?
Removes
duplicate keys.
Duplicate
subject:
ABC123-101
appears
twice.
Important
because:
- ADSL requires unique subject
Creating ADSL
PROC SORT DATA=EX_SUMMARY;
BY USUBJID;
RUN;
PROC PRINT DATA=EX_SUMMARY;
RUN;
OUTPUT:
| Obs | USUBJID | TRTSDT | TRTEDT |
|---|---|---|---|
| 1 | ABC123-101 | 01JAN2025 | 15JAN2025 |
| 2 | ABC123-102 | 15FEB2025 | 25FEB2025 |
| 3 | ABC123-103 | 01MAR2025 | 15MAR2025 |
| 4 | ABC123-104 | 01APR2025 | 20APR2025 |
| 5 | ABC123-105 | 01MAY2025 | 10MAY2025 |
| 6 | ABC123-106 | 10JUN2025 | 25JUN2025 |
| 7 | ABC123-107 | 01JUL2025 | . |
| 8 | ABC123-108 | 20AUG2025 | 30AUG2025 |
| 9 | ABC123-109 | 09SEP2025 | 15SEP2025 |
| 10 | ABC123-110 | 01OCT2025 | . |
| 11 | ABC123-111 | 11NOV2025 | 15NOV2025 |
| 12 | ABC123-112 | 01DEC2025 | 10DEC2025 |
| 13 | ABC123-113 | 03JAN2025 | . |
| 14 | ABC123-114 | 14FEB2025 | 18FEB2025 |
| 15 | ABC123-115 | 15MAR2025 | 20MAR2025 |
PROC SORT DATA=DEATH_FLAG;
BY USUBJID;
RUN;
PROC PRINT DATA=DEATH_FLAG;
RUN;
OUTPUT:
| Obs | USUBJID | DTHFL |
|---|---|---|
| 1 | ABC123-101 | N |
| 2 | ABC123-102 | N |
| 3 | ABC123-103 | N |
| 4 | ABC123-104 | N |
| 5 | ABC123-105 | N |
| 6 | ABC123-106 | N |
| 7 | ABC123-107 | N |
| 8 | ABC123-108 | N |
| 9 | ABC123-109 | N |
| 10 | ABC123-110 | N |
| 11 | ABC123-111 | N |
| 12 | ABC123-112 | N |
| 13 | ABC123-113 | N |
| 14 | ABC123-114 | N |
| 15 | ABC123-115 | N |
/*Step 7 :Merge Datasets*/
DATA ADSL;
MERGE DM_CLEAN
EX_SUMMARY
DEATH_FLAG;
BY USUBJID;
length TRT01A $20 SAFFL $1 ITTFL $1 EOSFL $1 DTHFL $1;
/* SEX Standardization */
SEX = UPCASE(STRIP(SEX));
IF SEX IN ("MALE","M") THEN SEX = "M";
ELSE IF SEX IN ("FEMALE","F") THEN SEX = "F";
ELSE SEX = "U";
/* RACE */
RACE = UPCASE(STRIP(RACE));
/* Actual Treatment */
TRT01A = UPCASE(STRIP(ARM));
/* SAFFL */
IF NOT MISSING(TRTSDT) THEN SAFFL = "Y";
ELSE SAFFL = "N";
/* ITTFL */
IF NOT MISSING(TRT01A) THEN ITTFL = "Y";
ELSE ITTFL = "N";
/* EOSFL */
IF NOT MISSING(RFENDTC) THEN EOSFL = "Y";
ELSE EOSFL = "N";
/* DTHFL*/
IF MISSING(DTHFL) THEN DTHFL = "N";
KEEP STUDYID USUBJID SITEID SEX RACE TRT01A
TRTSDT TRTEDT AGE SAFFL ITTFL EOSFL DTHFL;
RUN;
PROC PRINT DATA=ADSL;
RUN;
OUTPUT:
| Obs | STUDYID | USUBJID | SEX | RACE | SITEID | AGE | TRTSDT | TRTEDT | DTHFL | TRT01A | SAFFL | ITTFL | EOSFL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123 | ABC123-101 | M | ASIAN | 1001 | 45 | 01JAN2025 | 15JAN2025 | N | DRUG A | Y | Y | Y |
| 2 | ABC123 | ABC123-102 | F | WHITE | 1001 | 52 | 15FEB2025 | 25FEB2025 | N | PLACEBO | Y | Y | N |
| 3 | ABC123 | ABC123-103 | M | BLACK | 1002 | 38 | 01MAR2025 | 15MAR2025 | N | DRUG B | Y | Y | Y |
| 4 | ABC123 | ABC123-104 | F | ASIAN | 1002 | . | 01APR2025 | 20APR2025 | N | DRUG A | Y | Y | N |
| 5 | ABC123 | ABC123-105 | M | WHITE | 1003 | 60 | 01MAY2025 | 10MAY2025 | N | PLACEBO | Y | Y | Y |
| 6 | ABC123 | ABC123-106 | F | ASIAN | 1003 | 35 | 10JUN2025 | 25JUN2025 | N | DRUG B | Y | Y | Y |
| 7 | ABC123 | ABC123-107 | M | BLACK | 1004 | 48 | 01JUL2025 | . | N | DRUG A | Y | Y | N |
| 8 | ABC123 | ABC123-108 | F | ASIAN | 1004 | 29 | 20AUG2025 | 30AUG2025 | N | PLACEBO | Y | Y | Y |
| 9 | ABC123 | ABC123-109 | U | WHITE | 1005 | 55 | 09SEP2025 | 15SEP2025 | N | DRUG B | Y | Y | Y |
| 10 | ABC123 | ABC123-110 | M | ASIAN | 1005 | 42 | 01OCT2025 | . | N | DRUG A | Y | Y | N |
| 11 | ABC123 | ABC123-111 | F | BLACK | 1006 | 50 | 11NOV2025 | 15NOV2025 | N | PLACEBO | Y | Y | Y |
| 12 | ABC123 | ABC123-112 | M | WHITE | 1006 | 39 | 01DEC2025 | 10DEC2025 | N | DRUG B | Y | Y | Y |
| 13 | ABC123 | ABC123-113 | F | ASIAN | 1007 | 33 | 03JAN2025 | . | N | DRUG A | Y | Y | N |
| 14 | ABC123 | ABC123-114 | M | BLACK | 1007 | . | 14FEB2025 | 18FEB2025 | N | PLACEBO | Y | Y | Y |
| 15 | ABC123 | ABC123-115 | F | WHITE | 1008 | 41 | 15MAR2025 | 20MAR2025 | N | DRUG B | Y | Y | Y |
| 16 | ABC123 | ABC123-116 | M | ASIAN | 1008 | 58 | . | . | N | DRUG A | N | Y | N |
| 17 | ABC123 | ABC123-117 | F | BLACK | 1009 | 47 | . | . | N | PLACEBO | N | Y | Y |
| 18 | ABC123 | ABC123-118 | M | ASIAN | 1009 | 36 | . | . | N | DRUG B | N | Y | N |
| 19 | ABC123 | ABC123-119 | F | WHITE | 1010 | 44 | . | . | N | DRUG A | N | Y | Y |
| 20 | ABC123 | ABC123-120 | M | ASIAN | 1010 | 49 | . | . | N | PLACEBO | N | Y | N |
Why MERGE?
Combines:
- Demographics
- Exposure summary
- Death information
using:
BY USUBJID
Why?
Raw data
inconsistent:
|
Raw |
Standard |
|
Male |
M |
|
male |
M |
|
FEMALE |
F |
|
unknown |
U |
Standardization
is mandatory.
Actual
treatment assigned.
Examples:
- DRUG A
- PLACEBO
- DRUG B
Safety
Population Flag.
Meaning:
- Subject received treatment
Intent-to-Treat
Population.
Used in
efficacy analyses.
End of
Study Flag.
Keeps
only required variables.
Benefits:
- Reduces dataset size
- Cleaner output
- Better performance
Creating ADAE Dataset
ADAE =
Analysis Dataset for Adverse Events.
/*Step 1: Clean and Prepare AE*/
DATA AE_CLEAN;
SET AE;
LENGTH PARAMCD $10 PARAM $30 ASEV $15 ASER $1
AREL $20;
/* CONVERT DATES */
ASTDT = INPUT(AESTDTC, ANYDTDTE15.);
AENDT = INPUT(AEENDTC, ANYDTDTE15.);
FORMAT ASTDT AENDT DATE9.;
/* STANDARDIZE SEVERITY */
ASEV = UPCASE(STRIP(AESEV));
/* SERIOUS FLAG */
ASER = UPCASE(STRIP(AESER));
IF ASER IN ("YES","Y") THEN ASER = "Y";
ELSE ASER = "N";
/* RELATIONSHIP */
AREL = UPCASE(STRIP(AEREL));
/* PARAMCD / PARAM MAPPING */
AETERM = UPCASE(STRIP(AETERM));
IF AETERM = "HEADACHE" THEN DO;
PARAMCD = "HEAD";
PARAM = "HEADACHE";
END;
ELSE IF AETERM = "FEVER" THEN DO;
PARAMCD = "PYRX";
PARAM = "PYREXIA";
END;
ELSE IF AETERM = "VOMITING" THEN DO;
PARAMCD = "VOM";
PARAM = "VOMITING";
END;
ELSE IF AETERM = "COLD" THEN DO;
PARAMCD = "COLD";
PARAM = "COMMON COLD";
END;
ELSE DO;
PARAMCD = "OTHER";
PARAM = "OTHER";
END;
RUN;
PROC PRINT DATA=AE_CLEAN;
RUN;
OUTPUT:
| Obs | USUBJID | AETERM | AESTDTC | AEENDTC | AESEV | AESER | AEREL | PARAMCD | PARAM | ASEV | ASER | AREL | ASTDT | AENDT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123-101 | HEADACHE | 01JAN2025 | 05JAN2025 | Mild | Y | Related | HEAD | HEADACHE | MILD | Y | RELATED | 01JAN2025 | 05JAN2025 |
| 2 | ABC123-101 | FEVER | 2025-01-20 | 2025-01-25 | SEVERE | YES | RELATED | PYRX | PYREXIA | SEVERE | Y | RELATED | 20JAN2025 | 25JAN2025 |
| 3 | ABC123-102 | VOMITING | 02/10/2025 | Moderate | N | Not Related | VOM | VOMITING | MODERATE | N | NOT RELATED | 10FEB2025 | . | |
| 4 | ABC123-102 | COLD | 2025-02-28 | 2025-03-02 | mild | NO | Related | COLD | COMMON COLD | MILD | N | RELATED | 28FEB2025 | 02MAR2025 |
| 5 | ABC123-103 | HEADACHE | 01MAR2025 | 05MAR2025 | Severe | Y | RELATED | HEAD | HEADACHE | SEVERE | Y | RELATED | 01MAR2025 | 05MAR2025 |
| 6 | ABC123-103 | FEVER | 2025/03/20 | MODERATE | N | Not Related | PYRX | PYREXIA | MODERATE | N | NOT RELATED | 20MAR2025 | . | |
| 7 | ABC123-104 | VOMITING | 03-25-2025 | 04-01-2025 | Mild | No | Related | VOM | VOMITING | MILD | N | RELATED | 25MAR2025 | 01APR2025 |
| 8 | ABC123-104 | COLD | 15APR2025 | severe | YES | RELATED | COLD | COMMON COLD | SEVERE | Y | RELATED | 15APR2025 | . | |
| 9 | ABC123-105 | HEADACHE | 2025-04-20 | 2025-04-22 | Moderate | N | Not Related | HEAD | HEADACHE | MODERATE | N | NOT RELATED | 20APR2025 | 22APR2025 |
| 10 | ABC123-105 | FEVER | 2025-05-18 | SEVERE | Y | Related | PYRX | PYREXIA | SEVERE | Y | RELATED | 18MAY2025 | . | |
| 11 | ABC123-106 | VOMITING | 06/05/2025 | 06/08/2025 | mild | NO | RELATED | VOM | VOMITING | MILD | N | RELATED | 05JUN2025 | 08JUN2025 |
| 12 | ABC123-106 | COLD | 06/25/2025 | Moderate | YES | Not Related | COLD | COMMON COLD | MODERATE | Y | NOT RELATED | 25JUN2025 | . | |
| 13 | ABC123-107 | HEADACHE | 06-25-2025 | 07-02-2025 | SEVERE | Y | Related | HEAD | HEADACHE | SEVERE | Y | RELATED | 25JUN2025 | 02JUL2025 |
| 14 | ABC123-108 | FEVER | 2025-08-15 | Mild | No | RELATED | PYRX | PYREXIA | MILD | N | RELATED | 15AUG2025 | . | |
| 15 | ABC123-109 | VOMITING | 09SEP2025 | 12SEP2025 | severe | YES | Related | VOM | VOMITING | SEVERE | Y | RELATED | 09SEP2025 | 12SEP2025 |
| 16 | ABC123-110 | COLD | 2025/09/25 | Moderate | N | Not Related | COLD | COMMON COLD | MODERATE | N | NOT RELATED | 25SEP2025 | . | |
| 17 | ABC123-111 | HEADACHE | 11NOV2025 | 15NOV2025 | MILD | NO | RELATED | HEAD | HEADACHE | MILD | N | RELATED | 11NOV2025 | 15NOV2025 |
| 18 | ABC123-112 | FEVER | 2025-12-05 | Severe | Y | Related | PYRX | PYREXIA | SEVERE | Y | RELATED | 05DEC2025 | . | |
| 19 | ABC123-113 | VOMITING | 01/02/2025 | 01/05/2025 | Moderate | YES | Not Related | VOM | VOMITING | MODERATE | Y | NOT RELATED | 02JAN2025 | 05JAN2025 |
| 20 | ABC123-114 | COLD | 10FEB2025 | mild | N | RELATED | COLD | COMMON COLD | MILD | N | RELATED | 10FEB2025 | . | |
| 21 | ABC123-115 | HEADACHE | 2025-03-10 | 2025-03-15 | SEVERE | YES | Related | HEAD | HEADACHE | SEVERE | Y | RELATED | 10MAR2025 | 15MAR2025 |
| 22 | ABC123-101 | HEADACHE | 01JAN2025 | 05JAN2025 | Mild | Y | Related | HEAD | HEADACHE | MILD | Y | RELATED | 01JAN2025 | 05JAN2025 |
Explanation
Converts
AE dates into SAS dates.
Examples:
|
Raw |
Standard |
|
mild |
MILD |
|
severe |
SEVERE |
Standardizes:
- YES
- Y
- NO
- N
Makes
relationship values consistent.
Why PARAMCD?
ADaM
standards require:
- Short coded parameter
- Readable parameter name
Used in:
- Tables
- Listings
- Analysis
/*Step 2: Remove Duplicates*/
/*Based on:*/
/*USUBJID + AETERM + ASTDT*/
PROC SORT DATA=AE_CLEAN NODUPKEY;
BY USUBJID AETERM ASTDT;
RUN;
PROC PRINT DATA=AE_CLEAN;
RUN;
OUTPUT:
| Obs | USUBJID | AETERM | AESTDTC | AEENDTC | AESEV | AESER | AEREL | PARAMCD | PARAM | ASEV | ASER | AREL | ASTDT | AENDT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123-101 | FEVER | 2025-01-20 | 2025-01-25 | SEVERE | YES | RELATED | PYRX | PYREXIA | SEVERE | Y | RELATED | 20JAN2025 | 25JAN2025 |
| 2 | ABC123-101 | HEADACHE | 01JAN2025 | 05JAN2025 | Mild | Y | Related | HEAD | HEADACHE | MILD | Y | RELATED | 01JAN2025 | 05JAN2025 |
| 3 | ABC123-102 | COLD | 2025-02-28 | 2025-03-02 | mild | NO | Related | COLD | COMMON COLD | MILD | N | RELATED | 28FEB2025 | 02MAR2025 |
| 4 | ABC123-102 | VOMITING | 02/10/2025 | Moderate | N | Not Related | VOM | VOMITING | MODERATE | N | NOT RELATED | 10FEB2025 | . | |
| 5 | ABC123-103 | FEVER | 2025/03/20 | MODERATE | N | Not Related | PYRX | PYREXIA | MODERATE | N | NOT RELATED | 20MAR2025 | . | |
| 6 | ABC123-103 | HEADACHE | 01MAR2025 | 05MAR2025 | Severe | Y | RELATED | HEAD | HEADACHE | SEVERE | Y | RELATED | 01MAR2025 | 05MAR2025 |
| 7 | ABC123-104 | COLD | 15APR2025 | severe | YES | RELATED | COLD | COMMON COLD | SEVERE | Y | RELATED | 15APR2025 | . | |
| 8 | ABC123-104 | VOMITING | 03-25-2025 | 04-01-2025 | Mild | No | Related | VOM | VOMITING | MILD | N | RELATED | 25MAR2025 | 01APR2025 |
| 9 | ABC123-105 | FEVER | 2025-05-18 | SEVERE | Y | Related | PYRX | PYREXIA | SEVERE | Y | RELATED | 18MAY2025 | . | |
| 10 | ABC123-105 | HEADACHE | 2025-04-20 | 2025-04-22 | Moderate | N | Not Related | HEAD | HEADACHE | MODERATE | N | NOT RELATED | 20APR2025 | 22APR2025 |
| 11 | ABC123-106 | COLD | 06/25/2025 | Moderate | YES | Not Related | COLD | COMMON COLD | MODERATE | Y | NOT RELATED | 25JUN2025 | . | |
| 12 | ABC123-106 | VOMITING | 06/05/2025 | 06/08/2025 | mild | NO | RELATED | VOM | VOMITING | MILD | N | RELATED | 05JUN2025 | 08JUN2025 |
| 13 | ABC123-107 | HEADACHE | 06-25-2025 | 07-02-2025 | SEVERE | Y | Related | HEAD | HEADACHE | SEVERE | Y | RELATED | 25JUN2025 | 02JUL2025 |
| 14 | ABC123-108 | FEVER | 2025-08-15 | Mild | No | RELATED | PYRX | PYREXIA | MILD | N | RELATED | 15AUG2025 | . | |
| 15 | ABC123-109 | VOMITING | 09SEP2025 | 12SEP2025 | severe | YES | Related | VOM | VOMITING | SEVERE | Y | RELATED | 09SEP2025 | 12SEP2025 |
| 16 | ABC123-110 | COLD | 2025/09/25 | Moderate | N | Not Related | COLD | COMMON COLD | MODERATE | N | NOT RELATED | 25SEP2025 | . | |
| 17 | ABC123-111 | HEADACHE | 11NOV2025 | 15NOV2025 | MILD | NO | RELATED | HEAD | HEADACHE | MILD | N | RELATED | 11NOV2025 | 15NOV2025 |
| 18 | ABC123-112 | FEVER | 2025-12-05 | Severe | Y | Related | PYRX | PYREXIA | SEVERE | Y | RELATED | 05DEC2025 | . | |
| 19 | ABC123-113 | VOMITING | 01/02/2025 | 01/05/2025 | Moderate | YES | Not Related | VOM | VOMITING | MODERATE | Y | NOT RELATED | 02JAN2025 | 05JAN2025 |
| 20 | ABC123-114 | COLD | 10FEB2025 | mild | N | RELATED | COLD | COMMON COLD | MILD | N | RELATED | 10FEB2025 | . | |
| 21 | ABC123-115 | HEADACHE | 2025-03-10 | 2025-03-15 | SEVERE | YES | Related | HEAD | HEADACHE | SEVERE | Y | RELATED | 10MAR2025 | 15MAR2025 |
Why These Keys?
Uniqueness
based on:
- Subject
- Event
- Start Date
Industry-standard
duplicate handling.
/*STEP 3: MERGE TRTSDT FROM ADSL*/
PROC SORT DATA=ADSL;
BY USUBJID;
RUN;
PROC PRINT DATA=ADSL;
RUN;
OUTPUT:
| Obs | STUDYID | USUBJID | SEX | RACE | SITEID | AGE | TRTSDT | TRTEDT | DTHFL | TRT01A | SAFFL | ITTFL | EOSFL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123 | ABC123-101 | M | ASIAN | 1001 | 45 | 01JAN2025 | 15JAN2025 | N | DRUG A | Y | Y | Y |
| 2 | ABC123 | ABC123-102 | F | WHITE | 1001 | 52 | 15FEB2025 | 25FEB2025 | N | PLACEBO | Y | Y | N |
| 3 | ABC123 | ABC123-103 | M | BLACK | 1002 | 38 | 01MAR2025 | 15MAR2025 | N | DRUG B | Y | Y | Y |
| 4 | ABC123 | ABC123-104 | F | ASIAN | 1002 | . | 01APR2025 | 20APR2025 | N | DRUG A | Y | Y | N |
| 5 | ABC123 | ABC123-105 | M | WHITE | 1003 | 60 | 01MAY2025 | 10MAY2025 | N | PLACEBO | Y | Y | Y |
| 6 | ABC123 | ABC123-106 | F | ASIAN | 1003 | 35 | 10JUN2025 | 25JUN2025 | N | DRUG B | Y | Y | Y |
| 7 | ABC123 | ABC123-107 | M | BLACK | 1004 | 48 | 01JUL2025 | . | N | DRUG A | Y | Y | N |
| 8 | ABC123 | ABC123-108 | F | ASIAN | 1004 | 29 | 20AUG2025 | 30AUG2025 | N | PLACEBO | Y | Y | Y |
| 9 | ABC123 | ABC123-109 | U | WHITE | 1005 | 55 | 09SEP2025 | 15SEP2025 | N | DRUG B | Y | Y | Y |
| 10 | ABC123 | ABC123-110 | M | ASIAN | 1005 | 42 | 01OCT2025 | . | N | DRUG A | Y | Y | N |
| 11 | ABC123 | ABC123-111 | F | BLACK | 1006 | 50 | 11NOV2025 | 15NOV2025 | N | PLACEBO | Y | Y | Y |
| 12 | ABC123 | ABC123-112 | M | WHITE | 1006 | 39 | 01DEC2025 | 10DEC2025 | N | DRUG B | Y | Y | Y |
| 13 | ABC123 | ABC123-113 | F | ASIAN | 1007 | 33 | 03JAN2025 | . | N | DRUG A | Y | Y | N |
| 14 | ABC123 | ABC123-114 | M | BLACK | 1007 | . | 14FEB2025 | 18FEB2025 | N | PLACEBO | Y | Y | Y |
| 15 | ABC123 | ABC123-115 | F | WHITE | 1008 | 41 | 15MAR2025 | 20MAR2025 | N | DRUG B | Y | Y | Y |
| 16 | ABC123 | ABC123-116 | M | ASIAN | 1008 | 58 | . | . | N | DRUG A | N | Y | N |
| 17 | ABC123 | ABC123-117 | F | BLACK | 1009 | 47 | . | . | N | PLACEBO | N | Y | Y |
| 18 | ABC123 | ABC123-118 | M | ASIAN | 1009 | 36 | . | . | N | DRUG B | N | Y | N |
| 19 | ABC123 | ABC123-119 | F | WHITE | 1010 | 44 | . | . | N | DRUG A | N | Y | Y |
| 20 | ABC123 | ABC123-120 | M | ASIAN | 1010 | 49 | . | . | N | PLACEBO | N | Y | N |
DATA ADAE_PRE;
MERGE AE_CLEAN(IN=A)
ADSL(KEEP=USUBJID TRTSDT);
BY USUBJID;
IF A;
RUN;
PROC PRINT DATA=ADAE_PRE;;
RUN;
OUTPUT:
| Obs | USUBJID | AETERM | AESTDTC | AEENDTC | AESEV | AESER | AEREL | PARAMCD | PARAM | ASEV | ASER | AREL | ASTDT | AENDT | TRTSDT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123-101 | FEVER | 2025-01-20 | 2025-01-25 | SEVERE | YES | RELATED | PYRX | PYREXIA | SEVERE | Y | RELATED | 20JAN2025 | 25JAN2025 | 01JAN2025 |
| 2 | ABC123-101 | HEADACHE | 01JAN2025 | 05JAN2025 | Mild | Y | Related | HEAD | HEADACHE | MILD | Y | RELATED | 01JAN2025 | 05JAN2025 | 01JAN2025 |
| 3 | ABC123-102 | COLD | 2025-02-28 | 2025-03-02 | mild | NO | Related | COLD | COMMON COLD | MILD | N | RELATED | 28FEB2025 | 02MAR2025 | 15FEB2025 |
| 4 | ABC123-102 | VOMITING | 02/10/2025 | Moderate | N | Not Related | VOM | VOMITING | MODERATE | N | NOT RELATED | 10FEB2025 | . | 15FEB2025 | |
| 5 | ABC123-103 | FEVER | 2025/03/20 | MODERATE | N | Not Related | PYRX | PYREXIA | MODERATE | N | NOT RELATED | 20MAR2025 | . | 01MAR2025 | |
| 6 | ABC123-103 | HEADACHE | 01MAR2025 | 05MAR2025 | Severe | Y | RELATED | HEAD | HEADACHE | SEVERE | Y | RELATED | 01MAR2025 | 05MAR2025 | 01MAR2025 |
| 7 | ABC123-104 | COLD | 15APR2025 | severe | YES | RELATED | COLD | COMMON COLD | SEVERE | Y | RELATED | 15APR2025 | . | 01APR2025 | |
| 8 | ABC123-104 | VOMITING | 03-25-2025 | 04-01-2025 | Mild | No | Related | VOM | VOMITING | MILD | N | RELATED | 25MAR2025 | 01APR2025 | 01APR2025 |
| 9 | ABC123-105 | FEVER | 2025-05-18 | SEVERE | Y | Related | PYRX | PYREXIA | SEVERE | Y | RELATED | 18MAY2025 | . | 01MAY2025 | |
| 10 | ABC123-105 | HEADACHE | 2025-04-20 | 2025-04-22 | Moderate | N | Not Related | HEAD | HEADACHE | MODERATE | N | NOT RELATED | 20APR2025 | 22APR2025 | 01MAY2025 |
| 11 | ABC123-106 | COLD | 06/25/2025 | Moderate | YES | Not Related | COLD | COMMON COLD | MODERATE | Y | NOT RELATED | 25JUN2025 | . | 10JUN2025 | |
| 12 | ABC123-106 | VOMITING | 06/05/2025 | 06/08/2025 | mild | NO | RELATED | VOM | VOMITING | MILD | N | RELATED | 05JUN2025 | 08JUN2025 | 10JUN2025 |
| 13 | ABC123-107 | HEADACHE | 06-25-2025 | 07-02-2025 | SEVERE | Y | Related | HEAD | HEADACHE | SEVERE | Y | RELATED | 25JUN2025 | 02JUL2025 | 01JUL2025 |
| 14 | ABC123-108 | FEVER | 2025-08-15 | Mild | No | RELATED | PYRX | PYREXIA | MILD | N | RELATED | 15AUG2025 | . | 20AUG2025 | |
| 15 | ABC123-109 | VOMITING | 09SEP2025 | 12SEP2025 | severe | YES | Related | VOM | VOMITING | SEVERE | Y | RELATED | 09SEP2025 | 12SEP2025 | 09SEP2025 |
| 16 | ABC123-110 | COLD | 2025/09/25 | Moderate | N | Not Related | COLD | COMMON COLD | MODERATE | N | NOT RELATED | 25SEP2025 | . | 01OCT2025 | |
| 17 | ABC123-111 | HEADACHE | 11NOV2025 | 15NOV2025 | MILD | NO | RELATED | HEAD | HEADACHE | MILD | N | RELATED | 11NOV2025 | 15NOV2025 | 11NOV2025 |
| 18 | ABC123-112 | FEVER | 2025-12-05 | Severe | Y | Related | PYRX | PYREXIA | SEVERE | Y | RELATED | 05DEC2025 | . | 01DEC2025 | |
| 19 | ABC123-113 | VOMITING | 01/02/2025 | 01/05/2025 | Moderate | YES | Not Related | VOM | VOMITING | MODERATE | Y | NOT RELATED | 02JAN2025 | 05JAN2025 | 03JAN2025 |
| 20 | ABC123-114 | COLD | 10FEB2025 | mild | N | RELATED | COLD | COMMON COLD | MILD | N | RELATED | 10FEB2025 | . | 14FEB2025 | |
| 21 | ABC123-115 | HEADACHE | 2025-03-10 | 2025-03-15 | SEVERE | YES | Related | HEAD | HEADACHE | SEVERE | Y | RELATED | 10MAR2025 | 15MAR2025 | 15MAR2025 |
Why Merge TRTSDT?
Needed to
derive:
TRTEMFL
Treatment
Emergent Flag.
IN= Dataset Option
(in=a)
Creates
temporary variable.
Used to
keep only AE records.
/*STEP 4: CREATE ADAE*/
DATA ADAE;
SET ADAE_PRE;
LENGTH STUDYID $10 TRTEMFL $1 SAEFL $1 SEVFL $1 RELFL $1;
/* STUDYID */
STUDYID = "ABC123";
/* AESEQ */
BY USUBJID;
RETAIN AESEQ;
IF FIRST.USUBJID THEN AESEQ = 1;
ELSE AESEQ + 1;
/* TRTEMFL */
IF ASTDT >= TRTSDT THEN TRTEMFL = "Y";
ELSE TRTEMFL = "N";
/* SAEFL */
IF ASER = "Y" THEN SAEFL = "Y";
ELSE SAEFL = "N";
/* SEVFL */
IF ASEV = "SEVERE" THEN SEVFL = "Y";
ELSE SEVFL = "N";
/* RELFL */
IF AREL = "RELATED" THEN RELFL = "Y";
ELSE RELFL = "N";
KEEP STUDYID USUBJID AESEQ PARAMCD PARAM ASTDT AENDT
TRTEMFL ASEV ASER AREL SAEFL SEVFL RELFL;
RUN;
PROC PRINT DATA=ADAE;
RUN;
OUTPUT:
| Obs | USUBJID | PARAMCD | PARAM | ASEV | ASER | AREL | ASTDT | AENDT | STUDYID | TRTEMFL | SAEFL | SEVFL | RELFL | AESEQ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123-101 | PYRX | PYREXIA | SEVERE | Y | RELATED | 20JAN2025 | 25JAN2025 | ABC123 | Y | Y | Y | Y | 1 |
| 2 | ABC123-101 | HEAD | HEADACHE | MILD | Y | RELATED | 01JAN2025 | 05JAN2025 | ABC123 | Y | Y | N | Y | 2 |
| 3 | ABC123-102 | COLD | COMMON COLD | MILD | N | RELATED | 28FEB2025 | 02MAR2025 | ABC123 | Y | N | N | Y | 1 |
| 4 | ABC123-102 | VOM | VOMITING | MODERATE | N | NOT RELATED | 10FEB2025 | . | ABC123 | N | N | N | N | 2 |
| 5 | ABC123-103 | PYRX | PYREXIA | MODERATE | N | NOT RELATED | 20MAR2025 | . | ABC123 | Y | N | N | N | 1 |
| 6 | ABC123-103 | HEAD | HEADACHE | SEVERE | Y | RELATED | 01MAR2025 | 05MAR2025 | ABC123 | Y | Y | Y | Y | 2 |
| 7 | ABC123-104 | COLD | COMMON COLD | SEVERE | Y | RELATED | 15APR2025 | . | ABC123 | Y | Y | Y | Y | 1 |
| 8 | ABC123-104 | VOM | VOMITING | MILD | N | RELATED | 25MAR2025 | 01APR2025 | ABC123 | N | N | N | Y | 2 |
| 9 | ABC123-105 | PYRX | PYREXIA | SEVERE | Y | RELATED | 18MAY2025 | . | ABC123 | Y | Y | Y | Y | 1 |
| 10 | ABC123-105 | HEAD | HEADACHE | MODERATE | N | NOT RELATED | 20APR2025 | 22APR2025 | ABC123 | N | N | N | N | 2 |
| 11 | ABC123-106 | COLD | COMMON COLD | MODERATE | Y | NOT RELATED | 25JUN2025 | . | ABC123 | Y | Y | N | N | 1 |
| 12 | ABC123-106 | VOM | VOMITING | MILD | N | RELATED | 05JUN2025 | 08JUN2025 | ABC123 | N | N | N | Y | 2 |
| 13 | ABC123-107 | HEAD | HEADACHE | SEVERE | Y | RELATED | 25JUN2025 | 02JUL2025 | ABC123 | N | Y | Y | Y | 1 |
| 14 | ABC123-108 | PYRX | PYREXIA | MILD | N | RELATED | 15AUG2025 | . | ABC123 | N | N | N | Y | 1 |
| 15 | ABC123-109 | VOM | VOMITING | SEVERE | Y | RELATED | 09SEP2025 | 12SEP2025 | ABC123 | Y | Y | Y | Y | 1 |
| 16 | ABC123-110 | COLD | COMMON COLD | MODERATE | N | NOT RELATED | 25SEP2025 | . | ABC123 | N | N | N | N | 1 |
| 17 | ABC123-111 | HEAD | HEADACHE | MILD | N | RELATED | 11NOV2025 | 15NOV2025 | ABC123 | Y | N | N | Y | 1 |
| 18 | ABC123-112 | PYRX | PYREXIA | SEVERE | Y | RELATED | 05DEC2025 | . | ABC123 | Y | Y | Y | Y | 1 |
| 19 | ABC123-113 | VOM | VOMITING | MODERATE | Y | NOT RELATED | 02JAN2025 | 05JAN2025 | ABC123 | N | Y | N | N | 1 |
| 20 | ABC123-114 | COLD | COMMON COLD | MILD | N | RELATED | 10FEB2025 | . | ABC123 | N | N | N | Y | 1 |
| 21 | ABC123-115 | HEAD | HEADACHE | SEVERE | Y | RELATED | 10MAR2025 | 15MAR2025 | ABC123 | N | Y | Y | Y | 1 |
Prevents
variable reset.
Without
RETAIN:
- AESEQ resets every row
Automatic
BY-group variable.
Used
after sorting.
Indicates:
- First record for subject
Treatment
Emergent AE.
Very
important safety variable.
Serious
AE flag.
Severe AE
flag.
Related-to-treatment
flag.
Final ADAE Variables
|
Variable |
Purpose |
|
AESEQ |
Sequence
number |
|
PARAMCD |
Short
code |
|
PARAM |
Parameter
name |
|
TRTEMFL |
Treatment
emergent |
|
SAEFL |
Serious
event |
|
SEVFL |
Severe
event |
|
RELFL |
Related
event |
/*STEP 5: CREATE SERIOUS AE DATASET*/
DATA ADAE_SERIOUS;
SET ADAE;
WHERE SAEFL = "Y";
RUN;
PROC PRINT DATA=ADAE_SERIOUS;
RUN;
OUTPUT:
| Obs | USUBJID | PARAMCD | PARAM | ASEV | ASER | AREL | ASTDT | AENDT | STUDYID | TRTEMFL | SAEFL | SEVFL | RELFL | AESEQ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123-101 | PYRX | PYREXIA | SEVERE | Y | RELATED | 20JAN2025 | 25JAN2025 | ABC123 | Y | Y | Y | Y | 1 |
| 2 | ABC123-101 | HEAD | HEADACHE | MILD | Y | RELATED | 01JAN2025 | 05JAN2025 | ABC123 | Y | Y | N | Y | 2 |
| 3 | ABC123-103 | HEAD | HEADACHE | SEVERE | Y | RELATED | 01MAR2025 | 05MAR2025 | ABC123 | Y | Y | Y | Y | 2 |
| 4 | ABC123-104 | COLD | COMMON COLD | SEVERE | Y | RELATED | 15APR2025 | . | ABC123 | Y | Y | Y | Y | 1 |
| 5 | ABC123-105 | PYRX | PYREXIA | SEVERE | Y | RELATED | 18MAY2025 | . | ABC123 | Y | Y | Y | Y | 1 |
| 6 | ABC123-106 | COLD | COMMON COLD | MODERATE | Y | NOT RELATED | 25JUN2025 | . | ABC123 | Y | Y | N | N | 1 |
| 7 | ABC123-107 | HEAD | HEADACHE | SEVERE | Y | RELATED | 25JUN2025 | 02JUL2025 | ABC123 | N | Y | Y | Y | 1 |
| 8 | ABC123-109 | VOM | VOMITING | SEVERE | Y | RELATED | 09SEP2025 | 12SEP2025 | ABC123 | Y | Y | Y | Y | 1 |
| 9 | ABC123-112 | PYRX | PYREXIA | SEVERE | Y | RELATED | 05DEC2025 | . | ABC123 | Y | Y | Y | Y | 1 |
| 10 | ABC123-113 | VOM | VOMITING | MODERATE | Y | NOT RELATED | 02JAN2025 | 05JAN2025 | ABC123 | N | Y | N | N | 1 |
| 11 | ABC123-115 | HEAD | HEADACHE | SEVERE | Y | RELATED | 10MAR2025 | 15MAR2025 | ABC123 | N | Y | Y | Y | 1 |
Why?
Creates
subset for:
- Serious AE reporting
- Safety review
- Regulatory submission
Overall Clinical Trial Flow
DM ---> Subject Information
EX ---> Treatment Exposure
AE ---> Adverse Events
↓
ADSL ---> Subject-Level
Analysis Dataset
↓
ADAE ---> Adverse Event
Analysis Dataset
Real Industry Importance
This
project demonstrates:
- Raw data cleaning
- Date conversion
- Standardization
- Deduplication
- Merge logic
- BY-group processing
- ADaM derivations
- Clinical programming
workflow
These are
core interview topics for:
- SAS Programmer
- Clinical SAS Programmer
- Statistical Programmer
- ADaM Programmer
Key Interview Concepts
Covered
|
Topic |
Included |
|
PROC
SQL |
Yes |
|
MERGE |
Yes |
|
BY-group
processing |
Yes |
|
RETAIN |
Yes |
|
FIRST./LAST. |
Yes |
|
INPUT
function |
Yes |
|
ANYDTDTE
informat |
Yes |
|
Deduplication |
Yes |
|
Standardization |
Yes |
|
ADaM
derivation |
Yes |
|
Treatment
Emergent logic |
Yes |
|
Population
flags |
Yes |
20 Important Points About ADSL and ADAE Creation
1. DM
dataset is considered the parent dataset in clinical trials.
2. EX
dataset is used to derive treatment start and end dates.
3. AE
dataset stores adverse event information for safety analysis.
4. ADSL
contains one unique record per subject.
5. Duplicate
subjects must be removed before creating ADSL.
6. ANYDTDTE
informat helps convert multiple date formats into SAS dates.
7. TRTSDT
represents the first treatment exposure date.
8. TRTEDT
represents the last treatment exposure date.
9. Population
flags such as SAFFL and ITTFL are critical in analysis.
10. SEX values
must be standardized for consistency.
11. UPCASE and
STRIP functions are commonly used in data cleaning.
12. PROC SQL
helps summarize exposure information efficiently.
13. MERGE
statement combines datasets using common keys.
14. ADAE is an
analysis-ready adverse events dataset.
15. Treatment
Emergent Flag identifies events occurring after treatment start.
16. Serious
adverse events are identified using SAEFL.
17. Severity
levels are standardized into consistent categories.
18. PARAMCD and
PARAM are important ADaM variables for reporting.
19. BY-group
processing is essential for sequence derivation.
20. ADSL and
ADAE datasets are widely used for regulatory submissions and CSR reporting.
Summary
This clinical SAS programming project focused on transforming raw clinical
trial datasets into analysis-ready ADaM datasets using SAS programming
techniques. The project used three major domains: DM, EX, and AE. The DM
dataset provided subject-level demographic information, the EX dataset
contained treatment exposure details, and the AE dataset stored adverse event
information related to subject safety.
The first part of the project involved creating the ADSL dataset. Important
derivations such as TRTSDT, TRTEDT, SAFFL, ITTFL, EOSFL, and DTHFL were
developed using exposure and adverse event data. Data cleaning techniques such
as standardization, duplicate removal, and date conversion were performed to
ensure data quality and consistency.
The second part focused on creating the ADAE dataset. Adverse event dates
were converted into SAS date formats, severity and relationship variables were
standardized, and treatment emergent logic was derived using treatment start
dates from ADSL. Additional flags such as SAEFL, SEVFL, and RELFL were created
for safety reporting purposes.
Overall, the project demonstrated practical clinical trial programming
concepts widely used in pharmaceutical industries. It also showcased important
SAS skills such as PROC SQL, MERGE processing, BY-group logic, RETAIN
statements, and ADaM dataset derivations used in real-world regulatory
submissions.
Conclusion
The creation of ADSL and ADAE datasets is one of the most important
activities in clinical SAS programming because these datasets support
statistical analysis, safety evaluation, and regulatory reporting. In this
project, raw clinical trial datasets containing inconsistent formats, missing
values, and duplicate records were successfully transformed into standardized
ADaM datasets using SAS programming techniques.
The project highlighted the importance of data cleaning, date
standardization, deduplication, and derivation logic in clinical trials. By
deriving subject-level variables such as treatment start date, treatment end
date, and population flags in ADSL, the dataset became suitable for
subject-level analysis and reporting. Similarly, the ADAE dataset enabled
detailed adverse event analysis through derivation of treatment emergent flags,
serious adverse event flags, severity flags, and relationship indicators.
This project also demonstrated practical implementation of industry-standard
SAS concepts including PROC SQL, BY-group processing, RETAIN statements, INPUT
functions, and dataset merging. These techniques are commonly used by SAS
programmers working in pharmaceutical companies and CRO environments.
In conclusion, the project provides strong hands-on understanding of the
clinical data flow from raw datasets to analysis-ready ADaM datasets. It serves
as an excellent example for interview preparation, real-time project learning,
and understanding the workflow followed during clinical trial analysis and
submission processes.
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.
Disclaimer:
The datasets and analysis in this article are created for educational and demonstration purposes only. They do not represent REAL ADAM DATA.
Our Mission:
This blog provides industry-focused SAS programming tutorials and analytics projects covering finance, healthcare, and technology.
This project is suitable for:
· Students learning SAS
· Data analysts building portfolios
· Professionals preparing for SAS interviews
· Bloggers writing about analytics and Exams Reviewers and Observers
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Comments
Post a Comment