A Global Eye Bank Dataset into Trusted Intelligence Using SAS and R
Vision Without Validation Becomes Blind Analytics: Transforming a Global Eye Bank Dataset into Trusted Intelligence Using SAS and R
Introduction:From Corneal Donation Delays to
Analytical Failure
Imagine a
multinational ophthalmology research consortium preparing a global report on
corneal transplant accessibility. The study combines operational records from
eye banks across continents to evaluate donor availability, transplant waiting
periods, and regional healthcare inequality.
Three
weeks before executive presentation, statisticians discover a disaster.
Several
eye banks appear twice under slightly different names. Some donor counts are
negative because of data entry errors. Missing collection dates cause survival
analyses to fail. Regional labels appear as APAC, Apac, Asia-Pacific, and ASIA
PACIFIC simultaneously. A few email addresses contain spaces while others are
malformed entirely. Some eye banks report impossible utilization rates exceeding
100%.
The
dashboard shows misleading trends.
Machine
learning models generate incorrect forecasts.
Funding
decisions become unreliable.
Regulatory
submissions become questionable.
This is
not a software problem.
This is a
data quality problem.
Experienced
Clinical SAS Programmers understand that poor-quality source data can silently
destroy statistical conclusions long before model building begins.
In modern
healthcare analytics, data cleaning is not preprocessing.
It is
risk management.
Global Eye Banks Dataset
Overview
Our
analytical dataset contains information collected from major eye banks
worldwide.
Dataset Variables (9 Variables)
|
Variable |
Description |
|
EYE_BANK_ID |
Unique
eye bank identifier |
|
EYE_BANK_NAME |
Eye
bank name |
|
COUNTRY |
Country
location |
|
REGION_CODE |
Geographic
region |
|
DONORS_RECEIVED |
Annual
corneal donors |
|
TRANSPLANTS_DONE |
Annual
transplants |
|
AVG_WAIT_DAYS |
Average
waiting days |
|
CONTACT_EMAIL |
Operational
email |
|
LAST_AUDIT_DATE |
Last
regulatory audit date |
Intentional
corruption includes:
- Duplicate IDs
- Negative donor values
- Missing audit dates
- Invalid email addresses
- Incorrect region codes
- Mixed text case
- Leading spaces
- Impossible transplant counts
- NULL strings
- Invalid date values
1.SAS Raw Dataset Creation with Intentional Errors
data eyebank_raw;
infile datalines dsd dlm=',' truncover;
length eye_bank_id $8 eye_bank_name $45 country $25 region_code $20
contact_email $60;
informat last_audit_date anydtdte20.;
format last_audit_date date9.;
input eye_bank_id :$8. eye_bank_name :$45. country :$25. region_code :$20.
donors_received transplants_done avg_wait_days contact_email :$60.
last_audit_date :anydtdte20.;
datalines;
EB001,LionsEyeBank,India,APAC,450,420,32,support@lions.org,12JAN2025
EB002,VisionTrust,USA,NA,620,590,28,info@visiontrust.org,15FEB2025
EB003,GlobalEye,UK,EUROPE,-45,310,20,admin@globaleye.org,11MAR2025
EB003,GlobalEye,UK,Europe,500,310,20,admin@globaleye.org,11MAR2025
EB004,CorneaCare,India,APAC,520,700,18,NULL,.
EB005,BrightSight,Canada,na,410,390,25,help@brightsight,14APR2025
EB006,FutureVision,Australia,APAC,300,280,22,futurevision.org,30FEB2025
EB007,EyeLife,Germany,EU,290,275,-5,contact@eyelife.org,19MAY2025
EB008,WorldCornea,India,AsiaPacific,610,605,31,world@cornea.org,.
EB009,ClearView,Japan,APAC,510,490,27,clear@view.org,10JUN2025
EB010,VisionTrust,USA,NA,620,590,28,info@visiontrust.org,15FEB2025
EB011,Eye Hope,Brazil,LATAM,400,410,30,hope @eye.org,01JUL2025
EB012,RetinaGift,India,APAC,.,340,40,retina@gift.org,20AUG2025
EB013,NewLight,UAE,MEA,370,355,29,newlight@mail.com,15SEP2025
EB014,OcularBank,France,eu,280,270,23,ocular@bank.org,15OCT2025
EB015,DonorVision,India,APAC,530,520,26,donor@vision.org,22NOV2025
EB016,LifeSight,Kenya,AFRICA,250,240,34,sight@life.org,11DEC2025
EB017,GiftVision,India,APAC,0,230,36,gift@vision.org,.
EB018,HopeCornea,USA,NorthAmerica,550,530,21,hope@cornea.org,15JAN2025
EB019,EyeFuture,India,APAC,460,455,24,eyefuture.com,14FEB2025
EB020,GlobalCornea,Singapore,APAC,390,370,19,global@cornea.org,15MAR2025
EB021,BrightVision,India,APAC,440,435,33,bright@vision.org,01APR2025
EB022,ClearGift,UK,EUROPE,480,470,27,cleargift@eye.org,18MAY2025
;
run;
proc print data=eyebank_raw;
run;
OUTPUT:
| Obs | eye_bank_id | eye_bank_name | country | region_code | contact_email | last_audit_date | donors_received | transplants_done | avg_wait_days |
|---|---|---|---|---|---|---|---|---|---|
| 1 | EB001 | LionsEyeBank | India | APAC | support@lions.org | 12JAN2025 | 450 | 420 | 32 |
| 2 | EB002 | VisionTrust | USA | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 |
| 3 | EB003 | GlobalEye | UK | EUROPE | admin@globaleye.org | 11MAR2025 | -45 | 310 | 20 |
| 4 | EB003 | GlobalEye | UK | Europe | admin@globaleye.org | 11MAR2025 | 500 | 310 | 20 |
| 5 | EB004 | CorneaCare | India | APAC | NULL | . | 520 | 700 | 18 |
| 6 | EB005 | BrightSight | Canada | na | help@brightsight | 14APR2025 | 410 | 390 | 25 |
| 7 | EB006 | FutureVision | Australia | APAC | futurevision.org | . | 300 | 280 | 22 |
| 8 | EB007 | EyeLife | Germany | EU | contact@eyelife.org | 19MAY2025 | 290 | 275 | -5 |
| 9 | EB008 | WorldCornea | India | AsiaPacific | world@cornea.org | . | 610 | 605 | 31 |
| 10 | EB009 | ClearView | Japan | APAC | clear@view.org | 10JUN2025 | 510 | 490 | 27 |
| 11 | EB010 | VisionTrust | USA | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 |
| 12 | EB011 | Eye Hope | Brazil | LATAM | hope @eye.org | 01JUL2025 | 400 | 410 | 30 |
| 13 | EB012 | RetinaGift | India | APAC | retina@gift.org | 20AUG2025 | . | 340 | 40 |
| 14 | EB013 | NewLight | UAE | MEA | newlight@mail.com | 15SEP2025 | 370 | 355 | 29 |
| 15 | EB014 | OcularBank | France | eu | ocular@bank.org | 15OCT2025 | 280 | 270 | 23 |
| 16 | EB015 | DonorVision | India | APAC | donor@vision.org | 22NOV2025 | 530 | 520 | 26 |
| 17 | EB016 | LifeSight | Kenya | AFRICA | sight@life.org | 11DEC2025 | 250 | 240 | 34 |
| 18 | EB017 | GiftVision | India | APAC | gift@vision.org | . | 0 | 230 | 36 |
| 19 | EB018 | HopeCornea | USA | NorthAmerica | hope@cornea.org | 15JAN2025 | 550 | 530 | 21 |
| 20 | EB019 | EyeFuture | India | APAC | eyefuture.com | 14FEB2025 | 460 | 455 | 24 |
| 21 | EB020 | GlobalCornea | Singapore | APAC | global@cornea.org | 15MAR2025 | 390 | 370 | 19 |
| 22 | EB021 | BrightVision | India | APAC | bright@vision.org | 01APR2025 | 440 | 435 | 33 |
| 23 | EB022 | ClearGift | UK | EUROPE | cleargift@eye.org | 18MAY2025 | 480 | 470 | 27 |
Why LENGTH Must Come First
A common
production mistake is defining LENGTH after assignment statements.
eye_bank_name="International
Eye Donation Registry";
length eye_bank_name $20;
The
variable becomes permanently truncated because SAS allocates memory during
compilation.
This
creates silent corruption.
R behaves
differently because character vectors dynamically allocate memory rather than
using fixed-length storage definitions.
Clinical
production programmers always define LENGTH before assignment logic.
2.Cleaning Workflow Using DATA Step Programming
data eyebank_clean;
set eyebank_raw;
eye_bank_name=propcase(strip(eye_bank_name));
country=propcase(strip(country));
region_code=upcase(strip(region_code));
contact_email=compress(lowcase(contact_email),' ');
if donors_received <0 then donors_received=abs(donors_received);
if avg_wait_days<0 then avg_wait_days=.;
if missing(last_audit_date) then
last_audit_date=intnx('month',today(),-3,'same');
if index(contact_email,'@')=0 then
contact_email='unknown@placeholder.org';
if transplants_done>donors_received then
transplants_done=donors_received;
if region_code in ('EU','EUROPE') then region_code='EMEA';
select(region_code);
when('NA','NORTHAMERICA') region_code='NA';
when('APAC','ASIAPACIFIC') region_code='APAC';
otherwise;
end;
run;
proc print data=eyebank_clean;;
run;
OUTPUT:
| Obs | eye_bank_id | eye_bank_name | country | region_code | contact_email | last_audit_date | donors_received | transplants_done | avg_wait_days |
|---|---|---|---|---|---|---|---|---|---|
| 1 | EB001 | Lionseyebank | India | APAC | support@lions.org | 12JAN2025 | 450 | 420 | 32 |
| 2 | EB002 | Visiontrust | Usa | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 |
| 3 | EB003 | Globaleye | Uk | EMEA | admin@globaleye.org | 11MAR2025 | 45 | 45 | 20 |
| 4 | EB003 | Globaleye | Uk | EMEA | admin@globaleye.org | 11MAR2025 | 500 | 310 | 20 |
| 5 | EB004 | Corneacare | India | APAC | unknown@placeholder.org | 04APR2026 | 520 | 520 | 18 |
| 6 | EB005 | Brightsight | Canada | NA | help@brightsight | 14APR2025 | 410 | 390 | 25 |
| 7 | EB006 | Futurevision | Australia | APAC | unknown@placeholder.org | 04APR2026 | 300 | 280 | 22 |
| 8 | EB007 | Eyelife | Germany | EMEA | contact@eyelife.org | 19MAY2025 | 290 | 275 | . |
| 9 | EB008 | Worldcornea | India | APAC | world@cornea.org | 04APR2026 | 610 | 605 | 31 |
| 10 | EB009 | Clearview | Japan | APAC | clear@view.org | 10JUN2025 | 510 | 490 | 27 |
| 11 | EB010 | Visiontrust | Usa | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 |
| 12 | EB011 | Eye Hope | Brazil | LATAM | hope@eye.org | 01JUL2025 | 400 | 400 | 30 |
| 13 | EB012 | Retinagift | India | APAC | retina@gift.org | 20AUG2025 | . | . | 40 |
| 14 | EB013 | Newlight | Uae | MEA | newlight@mail.com | 15SEP2025 | 370 | 355 | 29 |
| 15 | EB014 | Ocularbank | France | EMEA | ocular@bank.org | 15OCT2025 | 280 | 270 | 23 |
| 16 | EB015 | Donorvision | India | APAC | donor@vision.org | 22NOV2025 | 530 | 520 | 26 |
| 17 | EB016 | Lifesight | Kenya | AFRICA | sight@life.org | 11DEC2025 | 250 | 240 | 34 |
| 18 | EB017 | Giftvision | India | APAC | gift@vision.org | 04APR2026 | 0 | 0 | 36 |
| 19 | EB018 | Hopecornea | Usa | NA | hope@cornea.org | 15JAN2025 | 550 | 530 | 21 |
| 20 | EB019 | Eyefuture | India | APAC | unknown@placeholder.org | 14FEB2025 | 460 | 455 | 24 |
| 21 | EB020 | Globalcornea | Singapore | APAC | global@cornea.org | 15MAR2025 | 390 | 370 | 19 |
| 22 | EB021 | Brightvision | India | APAC | bright@vision.org | 01APR2025 | 440 | 435 | 33 |
| 23 | EB022 | Cleargift | Uk | EMEA | cleargift@eye.org | 18MAY2025 | 480 | 470 | 27 |
Explanation
This
workflow combines multiple defensive programming approaches used in production
clinical environments. PROPCASE standardizes presentation layers while STRIP
removes hidden whitespace corruption. ABS corrects impossible negative
operational values without destroying historical traceability. INTNX generates
replacement dates using reproducible business rules. INDEX validates email
structure quickly without expensive regular expressions. SELECT-WHEN logic
improves readability compared with nested IF blocks and simplifies maintenance
during production deployment.
3.Deduplication Using PROC SORT
proc sort data=eyebank_clean out=audit_reference nodupkey;
by eye_bank_id;
run;
proc print data=audit_reference;
run;
OUTPUT:
OUTPUT:
| Obs | eye_bank_id | eye_bank_name | country | region_code | contact_email | last_audit_date | donors_received | transplants_done | avg_wait_days |
|---|---|---|---|---|---|---|---|---|---|
| 1 | EB001 | Lionseyebank | India | APAC | support@lions.org | 12JAN2025 | 450 | 420 | 32 |
| 2 | EB002 | Visiontrust | Usa | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 |
| 3 | EB003 | Globaleye | Uk | EMEA | admin@globaleye.org | 11MAR2025 | 45 | 45 | 20 |
| 4 | EB004 | Corneacare | India | APAC | unknown@placeholder.org | 04APR2026 | 520 | 520 | 18 |
| 5 | EB005 | Brightsight | Canada | NA | help@brightsight | 14APR2025 | 410 | 390 | 25 |
| 6 | EB006 | Futurevision | Australia | APAC | unknown@placeholder.org | 04APR2026 | 300 | 280 | 22 |
| 7 | EB007 | Eyelife | Germany | EMEA | contact@eyelife.org | 19MAY2025 | 290 | 275 | . |
| 8 | EB008 | Worldcornea | India | APAC | world@cornea.org | 04APR2026 | 610 | 605 | 31 |
| 9 | EB009 | Clearview | Japan | APAC | clear@view.org | 10JUN2025 | 510 | 490 | 27 |
| 10 | EB010 | Visiontrust | Usa | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 |
| 11 | EB011 | Eye Hope | Brazil | LATAM | hope@eye.org | 01JUL2025 | 400 | 400 | 30 |
| 12 | EB012 | Retinagift | India | APAC | retina@gift.org | 20AUG2025 | . | . | 40 |
| 13 | EB013 | Newlight | Uae | MEA | newlight@mail.com | 15SEP2025 | 370 | 355 | 29 |
| 14 | EB014 | Ocularbank | France | EMEA | ocular@bank.org | 15OCT2025 | 280 | 270 | 23 |
| 15 | EB015 | Donorvision | India | APAC | donor@vision.org | 22NOV2025 | 530 | 520 | 26 |
| 16 | EB016 | Lifesight | Kenya | AFRICA | sight@life.org | 11DEC2025 | 250 | 240 | 34 |
| 17 | EB017 | Giftvision | India | APAC | gift@vision.org | 04APR2026 | 0 | 0 | 36 |
| 18 | EB018 | Hopecornea | Usa | NA | hope@cornea.org | 15JAN2025 | 550 | 530 | 21 |
| 19 | EB019 | Eyefuture | India | APAC | unknown@placeholder.org | 14FEB2025 | 460 | 455 | 24 |
| 20 | EB020 | Globalcornea | Singapore | APAC | global@cornea.org | 15MAR2025 | 390 | 370 | 19 |
| 21 | EB021 | Brightvision | India | APAC | bright@vision.org | 01APR2025 | 440 | 435 | 33 |
| 22 | EB022 | Cleargift | Uk | EMEA | cleargift@eye.org | 18MAY2025 | 480 | 470 | 27 |
Explanation
PROC SORT
with NODUPKEY removes duplicate primary identifiers while preserving first
occurrence records. This method is extremely common in SDTM and ADaM
preparation pipelines. Analysts must document whether first occurrence
retention aligns with business rules because inappropriate deduplication can create
regulatory traceability concerns.
4.DATA Step Versus PROC SQL
proc sort data=eyebank_clean;
by eye_bank_id;
run;
proc print data=eyebank_clean;
run;
OUTPUT:
| Obs | eye_bank_id | eye_bank_name | country | region_code | contact_email | last_audit_date | donors_received | transplants_done | avg_wait_days |
|---|---|---|---|---|---|---|---|---|---|
| 1 | EB001 | Lionseyebank | India | APAC | support@lions.org | 12JAN2025 | 450 | 420 | 32 |
| 2 | EB002 | Visiontrust | Usa | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 |
| 3 | EB003 | Globaleye | Uk | EMEA | admin@globaleye.org | 11MAR2025 | 45 | 45 | 20 |
| 4 | EB003 | Globaleye | Uk | EMEA | admin@globaleye.org | 11MAR2025 | 500 | 310 | 20 |
| 5 | EB004 | Corneacare | India | APAC | unknown@placeholder.org | 04APR2026 | 520 | 520 | 18 |
| 6 | EB005 | Brightsight | Canada | NA | help@brightsight | 14APR2025 | 410 | 390 | 25 |
| 7 | EB006 | Futurevision | Australia | APAC | unknown@placeholder.org | 04APR2026 | 300 | 280 | 22 |
| 8 | EB007 | Eyelife | Germany | EMEA | contact@eyelife.org | 19MAY2025 | 290 | 275 | . |
| 9 | EB008 | Worldcornea | India | APAC | world@cornea.org | 04APR2026 | 610 | 605 | 31 |
| 10 | EB009 | Clearview | Japan | APAC | clear@view.org | 10JUN2025 | 510 | 490 | 27 |
| 11 | EB010 | Visiontrust | Usa | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 |
| 12 | EB011 | Eye Hope | Brazil | LATAM | hope@eye.org | 01JUL2025 | 400 | 400 | 30 |
| 13 | EB012 | Retinagift | India | APAC | retina@gift.org | 20AUG2025 | . | . | 40 |
| 14 | EB013 | Newlight | Uae | MEA | newlight@mail.com | 15SEP2025 | 370 | 355 | 29 |
| 15 | EB014 | Ocularbank | France | EMEA | ocular@bank.org | 15OCT2025 | 280 | 270 | 23 |
| 16 | EB015 | Donorvision | India | APAC | donor@vision.org | 22NOV2025 | 530 | 520 | 26 |
| 17 | EB016 | Lifesight | Kenya | AFRICA | sight@life.org | 11DEC2025 | 250 | 240 | 34 |
| 18 | EB017 | Giftvision | India | APAC | gift@vision.org | 04APR2026 | 0 | 0 | 36 |
| 19 | EB018 | Hopecornea | Usa | NA | hope@cornea.org | 15JAN2025 | 550 | 530 | 21 |
| 20 | EB019 | Eyefuture | India | APAC | unknown@placeholder.org | 14FEB2025 | 460 | 455 | 24 |
| 21 | EB020 | Globalcornea | Singapore | APAC | global@cornea.org | 15MAR2025 | 390 | 370 | 19 |
| 22 | EB021 | Brightvision | India | APAC | bright@vision.org | 01APR2025 | 440 | 435 | 33 |
| 23 | EB022 | Cleargift | Uk | EMEA | cleargift@eye.org | 18MAY2025 | 480 | 470 | 27 |
4.1 DATA Step Merge
data audit_reference;
length eye_bank_id $8
audit_status $20;
input eye_bank_id $ audit_status $;
datalines;
EB001 PASS
EB002 PASS
EB003 REVIEW
EB004 FAIL
EB005 PASS
EB006 REVIEW
EB007 PASS
EB008 REVIEW
EB009 PASS
EB010 PASS
EB011 FAIL
EB012 REVIEW
EB013 PASS
EB014 PASS
EB015 PASS
EB016 REVIEW
EB017 FAIL
EB018 PASS
EB019 REVIEW
EB020 PASS
EB021 PASS
EB022 PASS
;
run;
proc print data=audit_reference;
run;
OUTPUT:
| Obs | eye_bank_id | audit_status |
|---|---|---|
| 1 | EB001 | PASS |
| 2 | EB002 | PASS |
| 3 | EB003 | REVIEW |
| 4 | EB004 | FAIL |
| 5 | EB005 | PASS |
| 6 | EB006 | REVIEW |
| 7 | EB007 | PASS |
| 8 | EB008 | REVIEW |
| 9 | EB009 | PASS |
| 10 | EB010 | PASS |
| 11 | EB011 | FAIL |
| 12 | EB012 | REVIEW |
| 13 | EB013 | PASS |
| 14 | EB014 | PASS |
| 15 | EB015 | PASS |
| 16 | EB016 | REVIEW |
| 17 | EB017 | FAIL |
| 18 | EB018 | PASS |
| 19 | EB019 | REVIEW |
| 20 | EB020 | PASS |
| 21 | EB021 | PASS |
| 22 | EB022 | PASS |
data final_ds;
merge eyebank_clean(in=a)
audit_reference(in=b);
by eye_bank_id;
if a;
run;
proc print data=final_ds;
run;
OUTPUT:
| Obs | eye_bank_id | eye_bank_name | country | region_code | contact_email | last_audit_date | donors_received | transplants_done | avg_wait_days | audit_status |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EB001 | Lionseyebank | India | APAC | support@lions.org | 12JAN2025 | 450 | 420 | 32 | PASS |
| 2 | EB002 | Visiontrust | Usa | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 | PASS |
| 3 | EB003 | Globaleye | Uk | EMEA | admin@globaleye.org | 11MAR2025 | 45 | 45 | 20 | REVIEW |
| 4 | EB003 | Globaleye | Uk | EMEA | admin@globaleye.org | 11MAR2025 | 500 | 310 | 20 | REVIEW |
| 5 | EB004 | Corneacare | India | APAC | unknown@placeholder.org | 04APR2026 | 520 | 520 | 18 | FAIL |
| 6 | EB005 | Brightsight | Canada | NA | help@brightsight | 14APR2025 | 410 | 390 | 25 | PASS |
| 7 | EB006 | Futurevision | Australia | APAC | unknown@placeholder.org | 04APR2026 | 300 | 280 | 22 | REVIEW |
| 8 | EB007 | Eyelife | Germany | EMEA | contact@eyelife.org | 19MAY2025 | 290 | 275 | . | PASS |
| 9 | EB008 | Worldcornea | India | APAC | world@cornea.org | 04APR2026 | 610 | 605 | 31 | REVIEW |
| 10 | EB009 | Clearview | Japan | APAC | clear@view.org | 10JUN2025 | 510 | 490 | 27 | PASS |
| 11 | EB010 | Visiontrust | Usa | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 | PASS |
| 12 | EB011 | Eye Hope | Brazil | LATAM | hope@eye.org | 01JUL2025 | 400 | 400 | 30 | FAIL |
| 13 | EB012 | Retinagift | India | APAC | retina@gift.org | 20AUG2025 | . | . | 40 | REVIEW |
| 14 | EB013 | Newlight | Uae | MEA | newlight@mail.com | 15SEP2025 | 370 | 355 | 29 | PASS |
| 15 | EB014 | Ocularbank | France | EMEA | ocular@bank.org | 15OCT2025 | 280 | 270 | 23 | PASS |
| 16 | EB015 | Donorvision | India | APAC | donor@vision.org | 22NOV2025 | 530 | 520 | 26 | PASS |
| 17 | EB016 | Lifesight | Kenya | AFRICA | sight@life.org | 11DEC2025 | 250 | 240 | 34 | REVIEW |
| 18 | EB017 | Giftvision | India | APAC | gift@vision.org | 04APR2026 | 0 | 0 | 36 | FAIL |
| 19 | EB018 | Hopecornea | Usa | NA | hope@cornea.org | 15JAN2025 | 550 | 530 | 21 | PASS |
| 20 | EB019 | Eyefuture | India | APAC | unknown@placeholder.org | 14FEB2025 | 460 | 455 | 24 | REVIEW |
| 21 | EB020 | Globalcornea | Singapore | APAC | global@cornea.org | 15MAR2025 | 390 | 370 | 19 | PASS |
| 22 | EB021 | Brightvision | India | APAC | bright@vision.org | 01APR2025 | 440 | 435 | 33 | PASS |
| 23 | EB022 | Cleargift | Uk | EMEA | cleargift@eye.org | 18MAY2025 | 480 | 470 | 27 | PASS |
4.2 PROC SQL Join
proc sql;
create table final_sql as
select a.*,
b.audit_status
from eyebank_clean as a
left join audit_reference as b
on a.eye_bank_id=b.eye_bank_id;
quit;
proc print data=final_sql;
run;
OUTPUT:
| Obs | eye_bank_id | eye_bank_name | country | region_code | contact_email | last_audit_date | donors_received | transplants_done | avg_wait_days | audit_status |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EB001 | Lionseyebank | India | APAC | support@lions.org | 12JAN2025 | 450 | 420 | 32 | PASS |
| 2 | EB002 | Visiontrust | Usa | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 | PASS |
| 3 | EB003 | Globaleye | Uk | EMEA | admin@globaleye.org | 11MAR2025 | 45 | 45 | 20 | REVIEW |
| 4 | EB003 | Globaleye | Uk | EMEA | admin@globaleye.org | 11MAR2025 | 500 | 310 | 20 | REVIEW |
| 5 | EB004 | Corneacare | India | APAC | unknown@placeholder.org | 04APR2026 | 520 | 520 | 18 | FAIL |
| 6 | EB005 | Brightsight | Canada | NA | help@brightsight | 14APR2025 | 410 | 390 | 25 | PASS |
| 7 | EB006 | Futurevision | Australia | APAC | unknown@placeholder.org | 04APR2026 | 300 | 280 | 22 | REVIEW |
| 8 | EB007 | Eyelife | Germany | EMEA | contact@eyelife.org | 19MAY2025 | 290 | 275 | . | PASS |
| 9 | EB008 | Worldcornea | India | APAC | world@cornea.org | 04APR2026 | 610 | 605 | 31 | REVIEW |
| 10 | EB009 | Clearview | Japan | APAC | clear@view.org | 10JUN2025 | 510 | 490 | 27 | PASS |
| 11 | EB010 | Visiontrust | Usa | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 | PASS |
| 12 | EB011 | Eye Hope | Brazil | LATAM | hope@eye.org | 01JUL2025 | 400 | 400 | 30 | FAIL |
| 13 | EB012 | Retinagift | India | APAC | retina@gift.org | 20AUG2025 | . | . | 40 | REVIEW |
| 14 | EB013 | Newlight | Uae | MEA | newlight@mail.com | 15SEP2025 | 370 | 355 | 29 | PASS |
| 15 | EB014 | Ocularbank | France | EMEA | ocular@bank.org | 15OCT2025 | 280 | 270 | 23 | PASS |
| 16 | EB015 | Donorvision | India | APAC | donor@vision.org | 22NOV2025 | 530 | 520 | 26 | PASS |
| 17 | EB016 | Lifesight | Kenya | AFRICA | sight@life.org | 11DEC2025 | 250 | 240 | 34 | REVIEW |
| 18 | EB017 | Giftvision | India | APAC | gift@vision.org | 04APR2026 | 0 | 0 | 36 | FAIL |
| 19 | EB018 | Hopecornea | Usa | NA | hope@cornea.org | 15JAN2025 | 550 | 530 | 21 | PASS |
| 20 | EB019 | Eyefuture | India | APAC | unknown@placeholder.org | 14FEB2025 | 460 | 455 | 24 | REVIEW |
| 21 | EB020 | Globalcornea | Singapore | APAC | global@cornea.org | 15MAR2025 | 390 | 370 | 19 | PASS |
| 22 | EB021 | Brightvision | India | APAC | bright@vision.org | 01APR2025 | 440 | 435 | 33 | PASS |
| 23 | EB022 | Cleargift | Uk | EMEA | cleargift@eye.org | 18MAY2025 | 480 | 470 | 27 | PASS |
Comparison
DATA step
merges are faster for sorted datasets and provide FIRST./LAST. functionality.
PROC SQL offers superior readability for complex joins and aggregations.
Enterprise teams typically use both depending on workflow requirements.
5.FIRST. and LAST. Processing
proc sort data=final_ds;
by country;
run;
proc print data=final_ds;
run;
OUTPUT:
| Obs | eye_bank_id | eye_bank_name | country | region_code | contact_email | last_audit_date | donors_received | transplants_done | avg_wait_days | audit_status |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EB006 | Futurevision | Australia | APAC | unknown@placeholder.org | 04APR2026 | 300 | 280 | 22 | REVIEW |
| 2 | EB011 | Eye Hope | Brazil | LATAM | hope@eye.org | 01JUL2025 | 400 | 400 | 30 | FAIL |
| 3 | EB005 | Brightsight | Canada | NA | help@brightsight | 14APR2025 | 410 | 390 | 25 | PASS |
| 4 | EB014 | Ocularbank | France | EMEA | ocular@bank.org | 15OCT2025 | 280 | 270 | 23 | PASS |
| 5 | EB007 | Eyelife | Germany | EMEA | contact@eyelife.org | 19MAY2025 | 290 | 275 | . | PASS |
| 6 | EB001 | Lionseyebank | India | APAC | support@lions.org | 12JAN2025 | 450 | 420 | 32 | PASS |
| 7 | EB004 | Corneacare | India | APAC | unknown@placeholder.org | 04APR2026 | 520 | 520 | 18 | FAIL |
| 8 | EB008 | Worldcornea | India | APAC | world@cornea.org | 04APR2026 | 610 | 605 | 31 | REVIEW |
| 9 | EB012 | Retinagift | India | APAC | retina@gift.org | 20AUG2025 | . | . | 40 | REVIEW |
| 10 | EB015 | Donorvision | India | APAC | donor@vision.org | 22NOV2025 | 530 | 520 | 26 | PASS |
| 11 | EB017 | Giftvision | India | APAC | gift@vision.org | 04APR2026 | 0 | 0 | 36 | FAIL |
| 12 | EB019 | Eyefuture | India | APAC | unknown@placeholder.org | 14FEB2025 | 460 | 455 | 24 | REVIEW |
| 13 | EB021 | Brightvision | India | APAC | bright@vision.org | 01APR2025 | 440 | 435 | 33 | PASS |
| 14 | EB009 | Clearview | Japan | APAC | clear@view.org | 10JUN2025 | 510 | 490 | 27 | PASS |
| 15 | EB016 | Lifesight | Kenya | AFRICA | sight@life.org | 11DEC2025 | 250 | 240 | 34 | REVIEW |
| 16 | EB020 | Globalcornea | Singapore | APAC | global@cornea.org | 15MAR2025 | 390 | 370 | 19 | PASS |
| 17 | EB013 | Newlight | Uae | MEA | newlight@mail.com | 15SEP2025 | 370 | 355 | 29 | PASS |
| 18 | EB003 | Globaleye | Uk | EMEA | admin@globaleye.org | 11MAR2025 | 45 | 45 | 20 | REVIEW |
| 19 | EB003 | Globaleye | Uk | EMEA | admin@globaleye.org | 11MAR2025 | 500 | 310 | 20 | REVIEW |
| 20 | EB022 | Cleargift | Uk | EMEA | cleargift@eye.org | 18MAY2025 | 480 | 470 | 27 | PASS |
| 21 | EB002 | Visiontrust | Usa | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 | PASS |
| 22 | EB010 | Visiontrust | Usa | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 | PASS |
| 23 | EB018 | Hopecornea | Usa | NA | hope@cornea.org | 15JAN2025 | 550 | 530 | 21 | PASS |
data country_summary;
set final_ds;
by country;
retain total_donors 0;
if first.country then total_donors=0;
total_donors+donors_received;
if last.country then output;
run;
proc print data=country_summary;
run;
OUTPUT:
| Obs | eye_bank_id | eye_bank_name | country | region_code | contact_email | last_audit_date | donors_received | transplants_done | avg_wait_days | audit_status | total_donors |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EB006 | Futurevision | Australia | APAC | unknown@placeholder.org | 04APR2026 | 300 | 280 | 22 | REVIEW | 300 |
| 2 | EB011 | Eye Hope | Brazil | LATAM | hope@eye.org | 01JUL2025 | 400 | 400 | 30 | FAIL | 400 |
| 3 | EB005 | Brightsight | Canada | NA | help@brightsight | 14APR2025 | 410 | 390 | 25 | PASS | 410 |
| 4 | EB014 | Ocularbank | France | EMEA | ocular@bank.org | 15OCT2025 | 280 | 270 | 23 | PASS | 280 |
| 5 | EB007 | Eyelife | Germany | EMEA | contact@eyelife.org | 19MAY2025 | 290 | 275 | . | PASS | 290 |
| 6 | EB021 | Brightvision | India | APAC | bright@vision.org | 01APR2025 | 440 | 435 | 33 | PASS | 3010 |
| 7 | EB009 | Clearview | Japan | APAC | clear@view.org | 10JUN2025 | 510 | 490 | 27 | PASS | 510 |
| 8 | EB016 | Lifesight | Kenya | AFRICA | sight@life.org | 11DEC2025 | 250 | 240 | 34 | REVIEW | 250 |
| 9 | EB020 | Globalcornea | Singapore | APAC | global@cornea.org | 15MAR2025 | 390 | 370 | 19 | PASS | 390 |
| 10 | EB013 | Newlight | Uae | MEA | newlight@mail.com | 15SEP2025 | 370 | 355 | 29 | PASS | 370 |
| 11 | EB022 | Cleargift | Uk | EMEA | cleargift@eye.org | 18MAY2025 | 480 | 470 | 27 | PASS | 1025 |
| 12 | EB018 | Hopecornea | Usa | NA | hope@cornea.org | 15JAN2025 | 550 | 530 | 21 | PASS | 1790 |
Explanation
FIRST.
and LAST. variables are among the most powerful SAS capabilities unavailable in
standard SQL implementations. These automatic indicators support cumulative
calculations, patient visit sequencing, exposure duration calculations, and
longitudinal clinical analysis.
6.ARRAY Processing Example
data num_arrays;
set final_ds;
array nums donors_received
transplants_done
avg_wait_days;
do i=1 to dim(nums);
if nums(i)<0 then nums(i)=abs(nums(i));
end;
run;
proc print data=num_arrays;
run;
OUTPUT:
| Obs | eye_bank_id | eye_bank_name | country | region_code | contact_email | last_audit_date | donors_received | transplants_done | avg_wait_days | audit_status | i |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EB006 | Futurevision | Australia | APAC | unknown@placeholder.org | 04APR2026 | 300 | 280 | 22 | REVIEW | 4 |
| 2 | EB011 | Eye Hope | Brazil | LATAM | hope@eye.org | 01JUL2025 | 400 | 400 | 30 | FAIL | 4 |
| 3 | EB005 | Brightsight | Canada | NA | help@brightsight | 14APR2025 | 410 | 390 | 25 | PASS | 4 |
| 4 | EB014 | Ocularbank | France | EMEA | ocular@bank.org | 15OCT2025 | 280 | 270 | 23 | PASS | 4 |
| 5 | EB007 | Eyelife | Germany | EMEA | contact@eyelife.org | 19MAY2025 | 290 | 275 | . | PASS | 4 |
| 6 | EB001 | Lionseyebank | India | APAC | support@lions.org | 12JAN2025 | 450 | 420 | 32 | PASS | 4 |
| 7 | EB004 | Corneacare | India | APAC | unknown@placeholder.org | 04APR2026 | 520 | 520 | 18 | FAIL | 4 |
| 8 | EB008 | Worldcornea | India | APAC | world@cornea.org | 04APR2026 | 610 | 605 | 31 | REVIEW | 4 |
| 9 | EB012 | Retinagift | India | APAC | retina@gift.org | 20AUG2025 | . | . | 40 | REVIEW | 4 |
| 10 | EB015 | Donorvision | India | APAC | donor@vision.org | 22NOV2025 | 530 | 520 | 26 | PASS | 4 |
| 11 | EB017 | Giftvision | India | APAC | gift@vision.org | 04APR2026 | 0 | 0 | 36 | FAIL | 4 |
| 12 | EB019 | Eyefuture | India | APAC | unknown@placeholder.org | 14FEB2025 | 460 | 455 | 24 | REVIEW | 4 |
| 13 | EB021 | Brightvision | India | APAC | bright@vision.org | 01APR2025 | 440 | 435 | 33 | PASS | 4 |
| 14 | EB009 | Clearview | Japan | APAC | clear@view.org | 10JUN2025 | 510 | 490 | 27 | PASS | 4 |
| 15 | EB016 | Lifesight | Kenya | AFRICA | sight@life.org | 11DEC2025 | 250 | 240 | 34 | REVIEW | 4 |
| 16 | EB020 | Globalcornea | Singapore | APAC | global@cornea.org | 15MAR2025 | 390 | 370 | 19 | PASS | 4 |
| 17 | EB013 | Newlight | Uae | MEA | newlight@mail.com | 15SEP2025 | 370 | 355 | 29 | PASS | 4 |
| 18 | EB003 | Globaleye | Uk | EMEA | admin@globaleye.org | 11MAR2025 | 45 | 45 | 20 | REVIEW | 4 |
| 19 | EB003 | Globaleye | Uk | EMEA | admin@globaleye.org | 11MAR2025 | 500 | 310 | 20 | REVIEW | 4 |
| 20 | EB022 | Cleargift | Uk | EMEA | cleargift@eye.org | 18MAY2025 | 480 | 470 | 27 | PASS | 4 |
| 21 | EB002 | Visiontrust | Usa | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 | PASS | 4 |
| 22 | EB010 | Visiontrust | Usa | NA | info@visiontrust.org | 15FEB2025 | 620 | 590 | 28 | PASS | 4 |
| 23 | EB018 | Hopecornea | Usa | NA | hope@cornea.org | 15JAN2025 | 550 | 530 | 21 | PASS | 4 |
Explanation
ARRAY
processing reduces repetitive code and improves maintainability. In production
SDTM environments, arrays often validate hundreds of laboratory parameters
simultaneously.
7.Professional Reporting
proc report data=eyebank_clean nowd;
column country donors_received
transplants_done
avg_wait_days;
define country/group;
define donors_received/sum;
define transplants_done/sum;
define avg_wait_days/mean;
run;
OUTPUT:
| country | donors_received | transplants_done | avg_wait_days |
|---|---|---|---|
| Australia | 300 | 280 | 22 |
| Brazil | 400 | 400 | 30 |
| Canada | 410 | 390 | 25 |
| France | 280 | 270 | 23 |
| Germany | 290 | 275 | . |
| India | 3010 | 2955 | 30 |
| Japan | 510 | 490 | 27 |
| Kenya | 250 | 240 | 34 |
| Singapore | 390 | 370 | 19 |
| Uae | 370 | 355 | 29 |
| Uk | 1025 | 825 | 22.333333 |
| Usa | 1790 | 1710 | 25.666667 |
Explanation
PROC
REPORT remains one of the most powerful reporting procedures in SAS. Clinical
TLF generation frequently relies on REPORT because of its flexible compute
blocks and formatting options.
8.Reusable Macro Example
%macro check_missing(ds,var);
proc sql;
select count(*) as missing_count
from &ds
where missing(&var);
quit;
%mend;
%check_missing(eyebank_clean,contact_email);
OUTPUT:
| missing_count |
|---|
| 0 |
Explanation
Macros
improve standardization and reproducibility. Regulatory organizations strongly
prefer reusable validated components rather than duplicated business logic.
9.R Raw Dataset
library(tibble)
eyebank_raw <- tibble(
eye_bank_id = c("EB001","EB002","EB003","EB003","EB004","EB005","EB006",
"EB007","EB008","EB009","EB010","EB011","EB012","EB013","EB014",
"EB015","EB016","EB017","EB018","EB019","EB020","EB021","EB022"),
eye_bank_name = c("LionsEyeBank","VisionTrust","GlobalEye","GlobalEye",
"CorneaCare","BrightSight","FutureVision","EyeLife","WorldCornea",
"ClearView","VisionTrust","Eye Hope","RetinaGift","NewLight",
"OcularBank","DonorVision","LifeSight","GiftVision","HopeCornea",
"EyeFuture","GlobalCornea","BrightVision","ClearGift"),
country = c("India","USA","UK","UK","India","Canada","Australia",
"Germany","India","Japan","USA","Brazil","India","UAE","France",
"India","Kenya","India","USA","India","Singapore","India","UK"),
region_code = c("APAC","NA","EUROPE","Europe","APAC","na","APAC",
"EU","AsiaPacific","APAC","NA","LATAM","APAC","MEA","eu","APAC",
"AFRICA","APAC","NorthAmerica","APAC","APAC","APAC","EUROPE"),
donors_received = c(450,620,-45,500,520,410,300,290,610,510,620,400,
NA,370,280,530,250,0,550,460,390,440,480),
transplants_done = c(420,590,310,310,700,390,280,275,605,490,590,410,
340,355,270,520,240,230,530,455,370,435,470),
avg_wait_days = c(32,28,20,20,18,25,22,-5,31,27,28,30,40,29,
23,26,34,36,21,24,19,33,27),
contact_email = c("support@lions.org","info@visiontrust.org",
"admin@globaleye.org","admin@globaleye.org","NULL","help@brightsight",
"futurevision.org","contact@eyelife.org","world@cornea.org","clear@view.org",
"info@visiontrust.org","hope @eye.org","retina@gift.org","newlight@mail.com",
"ocular@bank.org","donor@vision.org","sight@life.org","gift@vision.org",
"hope@cornea.org","eyefuture.com","global@cornea.org","bright@vision.org",
"cleargift@eye.org"),
last_audit_date = c("12JAN2025","15FEB2025","11MAR2025","11MAR2025",".",
"14APR2025","30FEB2025","19MAY2025",".","10JUN2025","15FEB2025","01JUL2025",
"20AUG2025","15SEP2025","15OCT2025","22NOV2025","11DEC2025",".","15JAN2025",
"14FEB2025","15MAR2025","01APR2025","18MAY2025")
)
OUTPUT:
|
eye_bank_id |
eye_bank_name |
country |
region_code |
donors_received |
transplants_done |
avg_wait_days |
contact_email |
last_audit_date |
|
EB001 |
LionsEyeBank |
India |
APAC |
450 |
420 |
32 |
support@lions.org |
12JAN2025 |
|
EB002 |
VisionTrust |
USA |
NA |
620 |
590 |
28 |
info@visiontrust.org |
15FEB2025 |
|
EB003 |
GlobalEye |
UK |
EUROPE |
-45 |
310 |
20 |
admin@globaleye.org |
11MAR2025 |
|
EB003 |
GlobalEye |
UK |
Europe |
500 |
310 |
20 |
admin@globaleye.org |
11MAR2025 |
|
EB004 |
CorneaCare |
India |
APAC |
520 |
700 |
18 |
NULL |
. |
|
EB005 |
BrightSight |
Canada |
na |
410 |
390 |
25 |
help@brightsight |
14APR2025 |
|
EB006 |
FutureVision |
Australia |
APAC |
300 |
280 |
22 |
futurevision.org |
30FEB2025 |
|
EB007 |
EyeLife |
Germany |
EU |
290 |
275 |
-5 |
contact@eyelife.org |
19MAY2025 |
|
EB008 |
WorldCornea |
India |
AsiaPacific |
610 |
605 |
31 |
world@cornea.org |
. |
|
EB009 |
ClearView |
Japan |
APAC |
510 |
490 |
27 |
clear@view.org |
10JUN2025 |
|
EB010 |
VisionTrust |
USA |
NA |
620 |
590 |
28 |
info@visiontrust.org |
15FEB2025 |
|
EB011 |
Eye Hope |
Brazil |
LATAM |
400 |
410 |
30 |
hope @eye.org |
01JUL2025 |
|
EB012 |
RetinaGift |
India |
APAC |
340 |
40 |
retina@gift.org |
20AUG2025 |
|
|
EB013 |
NewLight |
UAE |
MEA |
370 |
355 |
29 |
newlight@mail.com |
15SEP2025 |
|
EB014 |
OcularBank |
France |
eu |
280 |
270 |
23 |
ocular@bank.org |
15OCT2025 |
|
EB015 |
DonorVision |
India |
APAC |
530 |
520 |
26 |
donor@vision.org |
22NOV2025 |
|
EB016 |
LifeSight |
Kenya |
AFRICA |
250 |
240 |
34 |
sight@life.org |
11DEC2025 |
|
EB017 |
GiftVision |
India |
APAC |
0 |
230 |
36 |
gift@vision.org |
. |
|
EB018 |
HopeCornea |
USA |
NorthAmerica |
550 |
530 |
21 |
hope@cornea.org |
15JAN2025 |
|
EB019 |
EyeFuture |
India |
APAC |
460 |
455 |
24 |
eyefuture.com |
14FEB2025 |
|
EB020 |
GlobalCornea |
Singapore |
APAC |
390 |
370 |
19 |
global@cornea.org |
15MAR2025 |
|
EB021 |
BrightVision |
India |
APAC |
440 |
435 |
33 |
bright@vision.org |
01APR2025 |
|
EB022 |
ClearGift |
UK |
EUROPE |
480 |
470 |
27 |
cleargift@eye.org |
18MAY2025 |
10.R Cleaning Layer
library(tidyverse)
library(janitor)
library(lubridate)
eyebank_clean <- eyebank_raw %>%
clean_names() %>%
mutate(eye_bank_name=str_to_title(str_trim(eye_bank_name)),
country=str_to_title(country),
region_code=str_to_upper(region_code),
donors_received=abs(donors_received),
avg_wait_days=if_else(avg_wait_days<0,NA_real_,
avg_wait_days),
contact_email=str_replace_all(contact_email," ",""),
contact_email=coalesce(contact_email,"unknown@placeholder.org")
) %>%
distinct(eye_bank_id,.keep_all=TRUE)
OUTPUT:
|
eye_bank_id |
eye_bank_name |
country |
region_code |
donors_received |
transplants_done |
avg_wait_days |
contact_email |
last_audit_date |
|
EB001 |
Lionseyebank |
India |
APAC |
450 |
420 |
32 |
support@lions.org |
12JAN2025 |
|
EB002 |
Visiontrust |
Usa |
NA |
620 |
590 |
28 |
info@visiontrust.org |
15FEB2025 |
|
EB003 |
Globaleye |
Uk |
EUROPE |
45 |
310 |
20 |
admin@globaleye.org |
11MAR2025 |
|
EB004 |
Corneacare |
India |
APAC |
520 |
700 |
18 |
NULL |
. |
|
EB005 |
Brightsight |
Canada |
NA |
410 |
390 |
25 |
help@brightsight |
14APR2025 |
|
EB006 |
Futurevision |
Australia |
APAC |
300 |
280 |
22 |
futurevision.org |
30FEB2025 |
|
EB007 |
Eyelife |
Germany |
EU |
290 |
275 |
contact@eyelife.org |
19MAY2025 |
|
|
EB008 |
Worldcornea |
India |
ASIAPACIFIC |
610 |
605 |
31 |
world@cornea.org |
. |
|
EB009 |
Clearview |
Japan |
APAC |
510 |
490 |
27 |
clear@view.org |
10JUN2025 |
|
EB010 |
Visiontrust |
Usa |
NA |
620 |
590 |
28 |
info@visiontrust.org |
15FEB2025 |
|
EB011 |
Eye Hope |
Brazil |
LATAM |
400 |
410 |
30 |
hope@eye.org |
01JUL2025 |
|
EB012 |
Retinagift |
India |
APAC |
340 |
40 |
retina@gift.org |
20AUG2025 |
|
|
EB013 |
Newlight |
Uae |
MEA |
370 |
355 |
29 |
newlight@mail.com |
15SEP2025 |
|
EB014 |
Ocularbank |
France |
EU |
280 |
270 |
23 |
ocular@bank.org |
15OCT2025 |
|
EB015 |
Donorvision |
India |
APAC |
530 |
520 |
26 |
donor@vision.org |
22NOV2025 |
|
EB016 |
Lifesight |
Kenya |
AFRICA |
250 |
240 |
34 |
sight@life.org |
11DEC2025 |
|
EB017 |
Giftvision |
India |
APAC |
0 |
230 |
36 |
gift@vision.org |
. |
|
EB018 |
Hopecornea |
Usa |
NORTHAMERICA |
550 |
530 |
21 |
hope@cornea.org |
15JAN2025 |
|
EB019 |
Eyefuture |
India |
APAC |
460 |
455 |
24 |
eyefuture.com |
14FEB2025 |
|
EB020 |
Globalcornea |
Singapore |
APAC |
390 |
370 |
19 |
global@cornea.org |
15MAR2025 |
|
EB021 |
Brightvision |
India |
APAC |
440 |
435 |
33 |
bright@vision.org |
01APR2025 |
|
EB022 |
Cleargift |
Uk |
EUROPE |
480 |
470 |
27 |
cleargift@eye.org |
18MAY2025 |
Explanation
The
tidyverse workflow mirrors SAS logic almost line for line. mutate() behaves
similarly to DATA step assignments while case_when() resembles SELECT-WHEN
blocks. distinct() mirrors PROC SORT NODUPKEY behavior and coalesce() provides
elegant missing value replacement.
Enterprise Validation and
Compliance
Clinical
data cleaning extends beyond aesthetics.
It
supports:
- SDTM traceability
- ADaM reproducibility
- Audit trail requirements
- QC independence
- Regulatory transparency
- Submission readiness
One
dangerous SAS behavior involves missing numeric values.
In SAS:
if age<18 then pediatric='Y';
Missing
ages satisfy this condition because missing numeric values are treated as
smaller than valid numbers.
This
subtle behavior has caused real production errors.
Business Logic Behind
Cleaning Decisions
Data
cleaning is never arbitrary.
Negative
donor counts become absolute values because donor registrations cannot be
negative in reality. Missing audit dates are imputed using operational
assumptions so monitoring reports continue functioning. Region codes are
standardized because APAC and AsiaPacific represent identical business concepts
but fragment aggregation logic.
Suppose
one eye bank reports donor count as -450 due to accidental keyboard entry.
Leaving this value unchanged could incorrectly reduce global donor estimates by
hundreds of procedures.
Similarly,
missing dates can break time-series analysis and trend forecasting.
Text
normalization improves grouping consistency.
Email
validation supports operational communication workflows.
Healthcare
analytics depends on semantic consistency as much as statistical correctness.
Twenty Data Cleaning Best
Practices
- Preserve raw data
permanently.
- Never overwrite source
files.
- Validate identifiers first.
- Standardize metadata.
- Maintain audit logs.
- Document assumptions.
- Version control macros.
- Separate QC teams.
- Validate dates early.
- Normalize text variables.
- Track imputation rules.
- Use reusable functions.
- Automate validation reports.
- Compare frequencies before
deployment.
- Protect lineage information.
- Monitor truncation risk.
- Standardize missing values.
- Perform peer review.
- Validate business rules.
- Maintain reproducible
workflows.
Twenty One-Line Insights
- Dirty data creates expensive
business mistakes.
- Validation logic beats
visual inspection.
- Metadata drives
reproducibility.
- Missing values deserve
business context.
- Duplicates distort reality.
- Standardization improves
trust.
- Defensive programming saves
projects.
- Clean inputs create reliable
AI.
- Audits reward traceability.
- Governance reduces
surprises.
- Business rules matter.
- Automation reduces human
error.
- Documentation improves
continuity.
- SQL is powerful but not
magical.
- DATA step remains
irreplaceable.
- Reusable macros reduce risk.
- Consistency improves
analytics.
- Data lineage protects
organizations.
- Regulatory trust starts with
quality.
- Reliable decisions require
reliable data.
SAS Versus R for Enterprise
Cleaning
SAS
provides unmatched auditability, metadata management, and production deployment
reliability. DATA step programming remains exceptionally efficient for row-wise
transformations and longitudinal clinical processing.
R
provides flexibility, open-source innovation, and advanced string manipulation
capabilities. Tidyverse workflows are highly expressive and often easier for
exploratory analytics.
Large
pharmaceutical organizations continue using SAS because regulatory ecosystems
depend heavily on traceability and reproducibility.
Technology
companies often prefer R because of integration with machine learning
ecosystems.
The
strongest analytical teams use both.
SAS
handles governance.
R handles
flexibility.
Together
they create robust analytical intelligence.
Conclusion
Modern
analytics systems depend less on algorithms and more on trustworthy input data.
An
advanced predictive model trained on corrupted information simply produces
sophisticated errors.
Eye bank
operations illustrate this challenge clearly.
Duplicate
identifiers distort operational capacity estimates. Missing audit dates
interrupt compliance monitoring. Invalid email structures disrupt communication
systems. Inconsistent regional classifications create misleading dashboards.
Data
cleaning therefore becomes a strategic capability rather than an operational
inconvenience.
SAS
contributes industrial-grade governance, reproducibility, and audit readiness.
R
contributes agility, innovation, and exploratory flexibility.
Organizations
that combine both ecosystems gain significant competitive advantages in
healthcare analytics, insurance risk modeling, financial fraud detection, and
retail intelligence.
The
future of analytics does not belong to the organization with the largest data
warehouse.
It belongs
to the organization with the most trustworthy data.
Because
in modern analytics, vision alone is not enough.
Validated
vision changes decisions.
Validated
decisions change outcomes.
Interview Questions and
Answers
1. How would you identify duplicate records in SAS?
Use PROC
SORT with NODUPKEY or PROC SQL GROUP BY with HAVING COUNT(*)>1 depending on
reporting requirements.
2. Why is LENGTH placement important in SAS?
LENGTH
statements must appear before assignments because variable storage allocation
occurs during compilation and late declarations create truncation risk.
3. What is the R equivalent of PROC SORT NODUPKEY?
The
equivalent approach is distinct() from dplyr.
4. Why are missing values dangerous in SAS?
Missing
numeric values are treated as smaller than valid numbers, causing unexpected IF
condition behavior.
5. When would you choose PROC SQL over DATA Step?
PROC SQL
performs better for complex joins and aggregations while DATA step excels at
sequential processing and FIRST./LAST. logic.
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
About the Author:
SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.
Disclaimer:
The datasets and analysis in this article are created for educational and demonstration purposes only. Here we learn about EYE BANK DATA.
Our Mission:
This blog provides industry-focused SAS programming tutorials and analytics projects covering finance, healthcare, and technology.
This project is suitable for:
· Students learning SAS
· Data analysts building portfolios
· Professionals preparing for SAS interviews
· Bloggers writing about analytics and smart cities
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Comments
Post a Comment