387. Can SAS Predict Railway Signal Failures Before Disaster Strikes?


--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

SAS STATEMENTS AND PROCEDURES USED IN THIS PROJECT:
DATA STEP | PROC SQL | PROC PRINT | PROC SGPLOT | MACROS | PROC CORR | PROC MEANS | PROC FREQ | PROC UNIVARIATE | PROC APPEND | PROC TRANSPOSE | PROC DATASETS DELETE | DATA STEP FUNCTIONS

INTRODUCTION

Railway signals are one of the most critical safety systems in any railway network.
Every green or red signal decides whether a train moves or stops. A single wrong signal can lead to massive delays, accidents, or even loss of life.

But in real life, railway departments manage thousands of signal systems across multiple routes. At that scale it is impractical to monitor manually:

·       Which signals fail frequently

·       Which routes are risky

·       Which systems cost too much but fail too often

·       And which maintenance records look suspicious

So the real question becomes:

Can data analytics help railway authorities predict failures, improve safety, and even detect fraud?

This project explores that question using SAS analytics.
We create a small simulated railway signal dataset (15 signals) and apply:

·       Statistical analysis

·       SQL reporting

·       Visual dashboards

·       Automation macros

·       Fraud detection logic


BUSINESS CONTEXT

Railway signaling systems are the backbone of railway safety.
A single signal failure can cause:

  • Train delays
  • Collisions
  • Passenger safety risk
  • Huge financial losses

Railway authorities maintain thousands of signal systems across routes.
They track:

  • How often signals fail
  • How long repairs take
  • Which routes are most risky
  • Which systems show suspicious patterns (possible fraud / negligence)

This project simulates how a Railway Operations Analytics Team uses SAS to:

  1. Monitor system reliability
  2. Identify high-risk signals
  3. Detect abnormal maintenance behavior
  4. Generate safety dashboards

TABLE OF CONTENTS

  1. Dataset Creation
  2. Date Engineering
  3. PROC SQL
  4. PROC MEANS
  5. PROC UNIVARIATE
  6. PROC FREQ
  7. PROC CORR
  8. PROC SGPLOT
  9. Utilization Macro
  10. Fraud Detection Macro
  11. SET / APPEND
  12. TRANSPOSE
  13. Character Functions
  14. Numeric Functions
  15. PROC DATASETS DELETE

1. DATASET CREATION

data railway_signals;
    /* List input; the colon modifier applies an informat to a space-delimited field */
    input Signal_ID $ Route:$15. Failure_Count Downtime_Minutes
          Safety_Impact $ Maintenance_Cost Reliability_Index
          Install_Date : date9. Last_Service_Date : date9.;
    /* Display the two date variables as DDMMMYYYY instead of raw day counts */
    format Install_Date Last_Service_Date date9.;
datalines;
SIG001 SouthLine 5 180 High 12000 82 01JAN2020 10DEC2023
SIG002 EastLine 2 60 Low 4000 95 15MAR2019 05NOV2023
SIG003 NorthLine 9 300 Critical 20000 65 10FEB2018 20OCT2023
SIG004 WestLine 1 20 Low 2000 98 05MAY2021 01JAN2024
SIG005 SouthLine 7 240 High 15000 70 12JUL2017 15SEP2023
SIG006 EastLine 4 120 Medium 8000 85 09AUG2019 30NOV2023
SIG007 NorthLine 10 360 Critical 25000 60 01JAN2016 05OCT2023
SIG008 WestLine 3 90 Medium 6000 88 18DEC2020 12DEC2023
SIG009 CentralLine 6 210 High 13000 75 02FEB2018 22NOV2023
SIG010 CentralLine 1 30 Low 3000 97 15JUN2022 05JAN2024
SIG011 MetroLine 8 260 High 17000 68 20MAR2017 18SEP2023
SIG012 MetroLine 2 70 Medium 5000 90 10JAN2021 02DEC2023
SIG013 FreightLine 11 400 Critical 30000 55 01JAN2015 15OCT2023
SIG014 FreightLine 3 100 Medium 7000 86 05MAY2020 01DEC2023
SIG015 ExpressLine 4 110 Medium 9000 83 12APR2019 28NOV2023
;
run;

proc print data=railway_signals;
run;

OUTPUT:

Obs  Signal_ID  Route        Failure_Count  Downtime_Minutes  Safety_Impact  Maintenance_Cost  Reliability_Index  Install_Date  Last_Service_Date
1    SIG001     SouthLine    5              180               High           12000             82                 01JAN2020     10DEC2023
2    SIG002     EastLine     2              60                Low            4000              95                 15MAR2019     05NOV2023
3    SIG003     NorthLine    9              300               Critical       20000             65                 10FEB2018     20OCT2023
4    SIG004     WestLine     1              20                Low            2000              98                 05MAY2021     01JAN2024
5    SIG005     SouthLine    7              240               High           15000             70                 12JUL2017     15SEP2023
6    SIG006     EastLine     4              120               Medium         8000              85                 09AUG2019     30NOV2023
7    SIG007     NorthLine    10             360               Critical       25000             60                 01JAN2016     05OCT2023
8    SIG008     WestLine     3              90                Medium         6000              88                 18DEC2020     12DEC2023
9    SIG009     CentralLine  6              210               High           13000             75                 02FEB2018     22NOV2023
10   SIG010     CentralLine  1              30                Low            3000              97                 15JUN2022     05JAN2024
11   SIG011     MetroLine    8              260               High           17000             68                 20MAR2017     18SEP2023
12   SIG012     MetroLine    2              70                Medium         5000              90                 10JAN2021     02DEC2023
13   SIG013     FreightLine  11             400               Critical       30000             55                 01JAN2015     15OCT2023
14   SIG014     FreightLine  3              100               Medium         7000              86                 05MAY2020     01DEC2023
15   SIG015     ExpressLine  4              110               Medium         9000              83                 12APR2019     28NOV2023

Used for raw data ingestion from operational systems.
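In practice the rows usually arrive as a file export rather than DATALINES. A minimal sketch of that ingestion step, assuming a CSV export with a header row; the file path and the output table name railway_signals_raw are hypothetical and not part of this project:

```sas
/* Hypothetical location of the export; replace with your own file */
filename sigcsv "/data/railway/signals_export.csv";

proc import datafile=sigcsv
            out=railway_signals_raw   /* assumed table name */
            dbms=csv
            replace;
    getnames=yes;        /* first row holds the column names */
    guessingrows=max;    /* scan every row before choosing column types */
run;

/* Check the variable names, types, and lengths that PROC IMPORT chose */
proc contents data=railway_signals_raw;
run;
```

PROC IMPORT guesses types from the data, so a PROC CONTENTS check like the one above is a sensible first step before any analysis.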


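2. DATE ENGINEERING (INTCK, INTNX)

data railway_dates;
    set railway_signals;
    /* INTCK counts the 1 January boundaries crossed, not completed years,
       and TODAY() makes the result depend on the run date */
    Years_In_Service = intck('year', Install_Date, today());
    /* INTNX aligns to the first day of the target month by default; no format
       is attached, so the value prints as a raw SAS date (23528 = 01JUN2024) */
    Next_Service_Date = intnx('month', Last_Service_Date, 6);
run;

proc print data=railway_dates;
    var Signal_ID Route Install_Date Last_Service_Date Years_In_Service Next_Service_Date;
run;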

OUTPUT:

Obs  Signal_ID  Route        Install_Date  Last_Service_Date  Years_In_Service  Next_Service_Date
1    SIG001     SouthLine    01JAN2020     10DEC2023          6                 23528
2    SIG002     EastLine     15MAR2019     05NOV2023          7                 23497
3    SIG003     NorthLine    10FEB2018     20OCT2023          8                 23467
4    SIG004     WestLine     05MAY2021     01JAN2024          5                 23558
5    SIG005     SouthLine    12JUL2017     15SEP2023          9                 23436
6    SIG006     EastLine     09AUG2019     30NOV2023          7                 23497
7    SIG007     NorthLine    01JAN2016     05OCT2023          10                23467
8    SIG008     WestLine     18DEC2020     12DEC2023          6                 23528
9    SIG009     CentralLine  02FEB2018     22NOV2023          8                 23497
10   SIG010     CentralLine  15JUN2022     05JAN2024          4                 23558
11   SIG011     MetroLine    20MAR2017     18SEP2023          9                 23436
12   SIG012     MetroLine    10JAN2021     02DEC2023          5                 23528
13   SIG013     FreightLine  01JAN2015     15OCT2023          11                23467
14   SIG014     FreightLine  05MAY2020     01DEC2023          6                 23528
15   SIG015     ExpressLine  12APR2019     28NOV2023          7                 23497

Used in real projects to:

·       Calculate service life

·       Predict next maintenance

·       Build SLA reports
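For reports that need readable dates and reproducible results, a variant along the following lines may be preferable. This is a sketch only: the table name railway_dates_fmt, the variables Ref_Date and Full_Years_In_Service, and the reference date itself are assumptions made for illustration.

```sas
data railway_dates_fmt;
    set railway_signals;

    /* Assumed fixed reference date, so the output does not change with the run date */
    Ref_Date = '31DEC2023'd;

    /* 'continuous' counts completed years from Install_Date
       instead of counting 1 January boundaries */
    Full_Years_In_Service = intck('year', Install_Date, Ref_Date, 'continuous');

    /* 'same' keeps the day of the month instead of snapping to the 1st */
    Next_Service_Date = intnx('month', Last_Service_Date, 6, 'same');

    /* Attach a format so the dates print as DDMMMYYYY rather than day counts */
    format Ref_Date Next_Service_Date date9.;
run;

proc print data=railway_dates_fmt;
    var Signal_ID Install_Date Full_Years_In_Service Last_Service_Date Next_Service_Date;
run;
```

The fourth argument of INTCK and INTNX is optional; leaving it out gives the boundary-counting and start-of-interval behaviour used in the original step.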


3. PROC SQL

proc sql;
    create table route_summary as
    select Route,
           count(Signal_ID) as Total_Signals,
           avg(Failure_Count) as Avg_Failures,
           sum(Maintenance_Cost) as Total_Cost
    from railway_dates
    group by Route;
quit;

proc print data=route_summary;
    var Route Total_Signals Avg_Failures Total_Cost;
run;

OUTPUT:

Obs  Route        Total_Signals  Avg_Failures  Total_Cost
1    CentralLine  2              3.5           16000
2    EastLine     2              3.0           12000
3    ExpressLine  1              4.0           9000
4    FreightLine  2              7.0           37000
5    MetroLine    2              5.0           22000
6    NorthLine    2              9.5           45000
7    SouthLine    2              6.0           27000
8    WestLine     2              2.0           8000

Used for:

·       Aggregations

·       Reporting dashboards

·       KPI summaries
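The same aggregation can be narrowed to the riskiest routes by filtering on the group result. A minimal sketch, assuming an illustrative threshold of 6 average failures (the threshold is not from the original project):

```sas
proc sql;
    select Route,
           count(*)           as Total_Signals,
           avg(Failure_Count) as Avg_Failures format=5.1
    from railway_signals
    group by Route
    /* HAVING filters groups after aggregation; WHERE would filter rows before it */
    having avg(Failure_Count) >= 6
    order by Avg_Failures desc;
quit;
```

Based on the route averages shown in the summary output, this keeps NorthLine, FreightLine, and SouthLine.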


4. PROC MEANS

proc means data=railway_dates mean min max;
    var Failure_Count Downtime_Minutes Maintenance_Cost Reliability_Index;
run;

OUTPUT:

The MEANS Procedure

Variable           Mean         Minimum     Maximum
Failure_Count      5.0666667    1.0000000   11.0000000
Downtime_Minutes   170.0000000  20.0000000  400.0000000
Maintenance_Cost   11733.33     2000.00     30000.00
Reliability_Index  79.8000000   55.0000000  98.0000000

Used by managers to:

·       Understand average failures

·       Identify worst-performing systems
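Overall averages hide differences between groups. One way to break the same statistics down, sketched here with Safety_Impact as the grouping variable (any categorical column such as Route would work the same way):

```sas
proc means data=railway_signals n mean min max maxdec=1;
    /* CLASS produces one set of statistics per Safety_Impact level */
    class Safety_Impact;
    var Failure_Count Downtime_Minutes Maintenance_Cost Reliability_Index;
run;
```

Unlike a BY statement, CLASS does not require the input to be sorted first.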

5. PROC UNIVARIATE

proc univariate data=railway_dates;
    var Downtime_Minutes;
    histogram Downtime_Minutes;
run;

OUTPUT:

The UNIVARIATE Procedure

Variable: Downtime_Minutes

Moments
N                15          Sum Weights       15
Mean             170         Sum Observations  2550
Std Deviation    120.178439  Variance          14442.8571
Skewness         0.60370171  Kurtosis          -0.7662038
Uncorrected SS   635700      Corrected SS      202200
Coeff Variation  70.6931993  Std Error Mean    31.0299395

Basic Statistical Measures
Location            Variability
Mean    170.0000    Std Deviation        120.17844
Median  120.0000    Variance             14443
Mode    .           Range                380.00000
                    Interquartile Range  190.00000

Tests for Location: Mu0=0
Test         Statistic      p Value
Student's t  t  5.47858     Pr > |t|   <.0001
Sign         M  7.5         Pr >= |M|  <.0001
Signed Rank  S  60          Pr >= |S|  <.0001

Quantiles (Definition 5)
Level       Quantile
100% Max    400
99%         400
95%         400
90%         360
75% Q3      260
50% Median  120
25% Q1      70
10%         30
5%          20
1%          20
0% Min      20

Extreme Observations
Lowest        Highest
Value  Obs    Value  Obs
20     4      240    5
30     10     260    11
60     2      300    3
70     12     360    7
90     8      400    13

[Figure: histogram of Downtime_Minutes]

Used by QA teams to:

·       Detect extreme downtime

·       Identify outliers
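The quartiles can also drive an explicit outlier rule. The sketch below uses the common 1.5 x IQR fence, which is a technique swapped in for illustration and not something the original project applies; the table names downtime_q and downtime_outliers and the fence variables are assumptions.

```sas
/* Write Q1 and Q3 of Downtime_Minutes to a one-row table */
proc univariate data=railway_signals noprint;
    var Downtime_Minutes;
    output out=downtime_q q1=Q1 q3=Q3;
run;

data downtime_outliers;
    /* Read the one-row quartile table once; its values are retained for every row */
    if _n_ = 1 then set downtime_q;
    set railway_signals;

    Lower_Fence = Q1 - 1.5 * (Q3 - Q1);
    Upper_Fence = Q3 + 1.5 * (Q3 - Q1);

    /* 1 when the downtime falls outside the fences, otherwise 0 */
    Outlier = (Downtime_Minutes < Lower_Fence or Downtime_Minutes > Upper_Fence);
run;
```

With the quartiles reported in the output (Q1 = 70, Q3 = 260), the fences work out to -215 and 545 minutes, so none of the 15 sample signals would be flagged.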

6. PROC FREQ

proc freq data=railway_dates;
    tables Safety_Impact Route;
run;

OUTPUT:

The FREQ Procedure

Safety_Impact  Frequency  Percent  Cumulative Frequency  Cumulative Percent
Critical       3          20.00    3                     20.00
High           4          26.67    7                     46.67
Low            3          20.00    10                    66.67
Medium         5          33.33    15                    100.00

Route          Frequency  Percent  Cumulative Frequency  Cumulative Percent
CentralLine    2          13.33    2                     13.33
EastLine       2          13.33    4                     26.67
ExpressLine    1          6.67     5                     33.33
FreightLine    2          13.33    7                     46.67
MetroLine      2          13.33    9                     60.00
NorthLine      2          13.33    11                    73.33
SouthLine      2          13.33    13                    86.67
WestLine       2          13.33    15                    100.00

Used to:

·       Count safety risks

·       Compare routes
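To compare routes by risk level directly, the two one-way tables can be combined into a cross-tabulation. A minimal sketch:

```sas
proc freq data=railway_signals;
    /* One cell per Route x Safety_Impact pair; the options keep counts only */
    tables Route*Safety_Impact / norow nocol nopercent;
run;
```

The NOROW, NOCOL, and NOPERCENT options suppress the percentages, which are hard to read with only one or two signals per route.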

7. PROC CORR

proc corr data=railway_dates;
    var Failure_Count Downtime_Minutes Maintenance_Cost Reliability_Index;
run;

OUTPUT:

The CORR Procedure

4 Variables: Failure_Count Downtime_Minutes Maintenance_Cost Reliability_Index

Simple Statistics
Variable           N   Mean       Std Dev    Sum       Minimum   Maximum
Failure_Count      15  5.06667    3.28344    76.00000  1.00000   11.00000
Downtime_Minutes   15  170.00000  120.17844  2550      20.00000  400.00000
Maintenance_Cost   15  11733      8328       176000    2000      30000
Reliability_Index  15  79.80000   13.61302   1197      55.00000  98.00000

Pearson Correlation Coefficients, N = 15
Prob > |r| under H0: Rho=0

                   Failure_Count  Downtime_Minutes  Maintenance_Cost  Reliability_Index
Failure_Count      1.00000        0.99559           0.98551           -0.99366
                                  <.0001            <.0001            <.0001
Downtime_Minutes   0.99559        1.00000           0.99204           -0.99023
                   <.0001                           <.0001            <.0001
Maintenance_Cost   0.98551        0.99204           1.00000           -0.98026
                   <.0001         <.0001                              <.0001
Reliability_Index  -0.99366       -0.99023          -0.98026          1.00000
                   <.0001         <.0001            <.0001

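The coefficients near ±0.99 reflect how regular this simulated data is; operational data is rarely this clean.

Used to:

·       Check whether more failures go with lower reliability

·       Shortlist variables for a model that predicts future risk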
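Pearson coefficients assume a linear relationship. As a robustness check, rank-based Spearman coefficients can be requested in the same call; this is an optional extra sketched here, not part of the original analysis:

```sas
/* PEARSON and SPEARMAN print both matrices; NOSIMPLE hides the descriptive table */
proc corr data=railway_signals pearson spearman nosimple;
    var Failure_Count Downtime_Minutes Maintenance_Cost Reliability_Index;
run;
```

If the two matrices disagree sharply on real data, the relationship is probably monotonic but not linear, or a few extreme signals are driving the Pearson value.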

8. PROC SGPLOT

proc sgplot data=railway_dates;
    scatter x=Failure_Count y=Reliability_Index;
run;

OUTPUT:

[Figure: scatter plot of Reliability_Index (y) against Failure_Count (x)]

Used for:

·       Executive dashboards

·       Visual inspection
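For a dashboard, the plain scatter can be dressed up with a fitted line and readable axis labels. A sketch, with the title and label text chosen here for illustration:

```sas
title "Reliability versus failure count";

proc sgplot data=railway_signals;
    /* REG draws the points plus a fitted line; CLM adds confidence limits for the mean */
    reg x=Failure_Count y=Reliability_Index / clm;
    xaxis label="Failure count";
    yaxis label="Reliability index";
run;

/* Clear the title so it does not carry over to later output */
title;
```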


9. UTILIZATION MACRO

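%macro classify;
data utilization;
    set railway_dates;
    /* Both bounds of the middle band are strict: only 5 and 6 are Moderate,
       so a signal with exactly 4 or 7 failures is labelled Stable.
       Use 4 <= Failure_Count <= 7 if those values should count as Moderate. */
    if Failure_Count > 7 then Utilization = "Overloaded";
    else if 4 < Failure_Count < 7 then Utilization = "Moderate";
    else Utilization = "Stable";
run;

proc print data=utilization;
run;
%mend classify;

%classify;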

OUTPUT:

Obs  Signal_ID  Route        Failure_Count  Downtime_Minutes  Safety_Impact  Maintenance_Cost  Reliability_Index  Install_Date  Last_Service_Date  Years_In_Service  Next_Service_Date  Utilization
1    SIG001     SouthLine    5              180               High           12000             82                 01JAN2020     10DEC2023          6                 23528              Moderate
2    SIG002     EastLine     2              60                Low            4000              95                 15MAR2019     05NOV2023          7                 23497              Stable
3    SIG003     NorthLine    9              300               Critical       20000             65                 10FEB2018     20OCT2023          8                 23467              Overloaded
4    SIG004     WestLine     1              20                Low            2000              98                 05MAY2021     01JAN2024          5                 23558              Stable
5    SIG005     SouthLine    7              240               High           15000             70                 12JUL2017     15SEP2023          9                 23436              Stable
6    SIG006     EastLine     4              120               Medium         8000              85                 09AUG2019     30NOV2023          7                 23497              Stable
7    SIG007     NorthLine    10             360               Critical       25000             60                 01JAN2016     05OCT2023          10                23467              Overloaded
8    SIG008     WestLine     3              90                Medium         6000              88                 18DEC2020     12DEC2023          6                 23528              Stable
9    SIG009     CentralLine  6              210               High           13000             75                 02FEB2018     22NOV2023          8                 23497              Moderate
10   SIG010     CentralLine  1              30                Low            3000              97                 15JUN2022     05JAN2024          4                 23558              Stable
11   SIG011     MetroLine    8              260               High           17000             68                 20MAR2017     18SEP2023          9                 23436              Overloaded
12   SIG012     MetroLine    2              70                Medium         5000              90                 10JAN2021     02DEC2023          5                 23528              Stable
13   SIG013     FreightLine  11             400               Critical       30000             55                 01JAN2015     15OCT2023          11                23467              Overloaded
14   SIG014     FreightLine  3              100               Medium         7000              86                 05MAY2020     01DEC2023          6                 23528              Stable
15   SIG015     ExpressLine  4              110               Medium         9000              83                 12APR2019     28NOV2023          7                 23497              Stable

Used in industry for:

·       Reusable logic

·       Automation

·       Batch processing
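A macro becomes more reusable once the table names and thresholds are parameters. The sketch below is one possible variant, not the project's macro: the name classify_util, its parameters, and the output table utilization_v2 are all hypothetical, and the bands are written so that no failure count can fall between them.

```sas
%macro classify_util(in=, out=, high=7, low=4);
    data &out;
        set &in;
        length Utilization $10;
        /* Each ELSE IF only sees values the previous test rejected,
           so the bands cover every non-missing count without gaps */
        if Failure_Count > &high then Utilization = "Overloaded";
        else if Failure_Count > &low then Utilization = "Moderate";
        else Utilization = "Stable";
    run;
%mend classify_util;

/* Default thresholds */
%classify_util(in=railway_signals, out=utilization_v2);

/* Stricter thresholds for a safety review (illustrative values) */
%classify_util(in=railway_signals, out=utilization_strict, high=5, low=2);
```

With the defaults, a signal with exactly 7 failures is classed as Moderate and one with exactly 4 as Stable. A missing Failure_Count would also land in Stable, so a real pipeline would likely test for missing values first.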


10. FRAUD DETECTION MACRO

%macro fraud;
data fraud_flags;
    set utilization;
    /* Rule: high spend combined with few failures is treated as suspicious */
    if Maintenance_Cost > 20000 and Failure_Count < 3 then Fraud_Flag = "Yes";
    else Fraud_Flag = "No";
run;

proc print data=fraud_flags;
run;
%mend fraud;

%fraud;

OUTPUT:

Obs  Signal_ID  Route        Failure_Count  Downtime_Minutes  Safety_Impact  Maintenance_Cost  Reliability_Index  Install_Date  Last_Service_Date  Years_In_Service  Next_Service_Date  Utilization  Fraud_Flag
1    SIG001     SouthLine    5              180               High           12000             82                 01JAN2020     10DEC2023          6                 23528              Moderate     No
2    SIG002     EastLine     2              60                Low            4000              95                 15MAR2019     05NOV2023          7                 23497              Stable       No
3    SIG003     NorthLine    9              300               Critical       20000             65                 10FEB2018     20OCT2023          8                 23467              Overloaded   No
4    SIG004     WestLine     1              20                Low            2000              98                 05MAY2021     01JAN2024          5                 23558              Stable       No
5    SIG005     SouthLine    7              240               High           15000             70                 12JUL2017     15SEP2023          9                 23436              Stable       No
6    SIG006     EastLine     4              120               Medium         8000              85                 09AUG2019     30NOV2023          7                 23497              Stable       No
7    SIG007     NorthLine    10             360               Critical       25000             60                 01JAN2016     05OCT2023          10                23467              Overloaded   No
8    SIG008     WestLine     3              90                Medium         6000              88                 18DEC2020     12DEC2023          6                 23528              Stable       No
9    SIG009     CentralLine  6              210               High           13000             75                 02FEB2018     22NOV2023          8                 23497              Moderate     No
10   SIG010     CentralLine  1              30                Low            3000              97                 15JUN2022     05JAN2024          4                 23558              Stable       No
11   SIG011     MetroLine    8              260               High           17000             68                 20MAR2017     18SEP2023          9                 23436              Overloaded   No
12   SIG012     MetroLine    2              70                Medium         5000              90                 10JAN2021     02DEC2023          5                 23528              Stable       No
13   SIG013     FreightLine  11             400               Critical       30000             55                 01JAN2015     15OCT2023          11                23467              Overloaded   No
14   SIG014     FreightLine  3              100               Medium         7000              86                 05MAY2020     01DEC2023          6                 23528              Stable       No
15   SIG015     ExpressLine  4              110               Medium         9000              83                 12APR2019     28NOV2023          7                 23497              Stable       No

Very common in:

·       Finance

·       Maintenance

·       Insurance analytics

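Flags cases like:

Low failures but very high cost → possible fake billing.

In this sample no signal has a cost above 20000 together with fewer than 3 failures, so every row is flagged No.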
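Fixed cut-offs can miss everything when cost and failures move together, as they do in this sample. One alternative is to rank signals by cost per failure and send the top few for manual review. This is a sketch of a different technique rather than the project's rule; the table name cost_review and the variable Cost_Per_Failure are assumptions.

```sas
data cost_review;
    set railway_signals;
    /* Guard against division by zero for signals with no recorded failures */
    if Failure_Count > 0 then Cost_Per_Failure = Maintenance_Cost / Failure_Count;
run;

/* Highest cost per failure first; missing values sort to the end */
proc sort data=cost_review;
    by descending Cost_Per_Failure;
run;

/* Review list: the five most expensive signals per failure */
proc print data=cost_review(obs=5);
    var Signal_ID Route Failure_Count Maintenance_Cost Cost_Per_Failure;
run;
```

A high ratio is only a prompt for review. New or rarely failing equipment can legitimately show a high cost per failure.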


11. SET / APPEND

data backup;
    set fraud_flags;
run;

proc print data=backup;
run;

OUTPUT:

Obs  Signal_ID  Route        Failure_Count  Downtime_Minutes  Safety_Impact  Maintenance_Cost  Reliability_Index  Install_Date  Last_Service_Date  Years_In_Service  Next_Service_Date  Utilization  Fraud_Flag
1    SIG001     SouthLine    5              180               High           12000             82                 01JAN2020     10DEC2023          6                 23528              Moderate     No
2    SIG002     EastLine     2              60                Low            4000              95                 15MAR2019     05NOV2023          7                 23497              Stable       No
3    SIG003     NorthLine    9              300               Critical       20000             65                 10FEB2018     20OCT2023          8                 23467              Overloaded   No
4    SIG004     WestLine     1              20                Low            2000              98                 05MAY2021     01JAN2024          5                 23558              Stable       No
5    SIG005     SouthLine    7              240               High           15000             70                 12JUL2017     15SEP2023          9                 23436              Stable       No
6    SIG006     EastLine     4              120               Medium         8000              85                 09AUG2019     30NOV2023          7                 23497              Stable       No
7    SIG007     NorthLine    10             360               Critical       25000             60                 01JAN2016     05OCT2023          10                23467              Overloaded   No
8    SIG008     WestLine     3              90                Medium         6000              88                 18DEC2020     12DEC2023          6                 23528              Stable       No
9    SIG009     CentralLine  6              210               High           13000             75                 02FEB2018     22NOV2023          8                 23497              Moderate     No
10   SIG010     CentralLine  1              30                Low            3000              97                 15JUN2022     05JAN2024          4                 23558              Stable       No
11   SIG011     MetroLine    8              260               High           17000             68                 20MAR2017     18SEP2023          9                 23436              Overloaded   No
12   SIG012     MetroLine    2              70                Medium         5000              90                 10JAN2021     02DEC2023          5                 23528              Stable       No
13   SIG013     FreightLine  11             400               Critical       30000             55                 01JAN2015     15OCT2023          11                23467              Overloaded   No
14   SIG014     FreightLine  3              100               Medium         7000              86                 05MAY2020     01DEC2023          6                 23528              Stable       No
15   SIG015     ExpressLine  4              110               Medium         9000              83                 12APR2019     28NOV2023          7                 23497              Stable       No

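/* FORCE is required because backup carries two variables (Utilization,
   Fraud_Flag) that the base lacks; they are dropped from the appended rows.
   The base table itself is modified and now holds 30 rows. */
proc append base=railway_dates
            data=backup force;
run;

proc print data=railway_dates;
run;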

OUTPUT:

Obs  Signal_ID  Route        Failure_Count  Downtime_Minutes  Safety_Impact  Maintenance_Cost  Reliability_Index  Install_Date  Last_Service_Date  Years_In_Service  Next_Service_Date
1    SIG001     SouthLine    5              180               High           12000             82                 01JAN2020     10DEC2023          6                 23528
2    SIG002     EastLine     2              60                Low            4000              95                 15MAR2019     05NOV2023          7                 23497
3    SIG003     NorthLine    9              300               Critical       20000             65                 10FEB2018     20OCT2023          8                 23467
4    SIG004     WestLine     1              20                Low            2000              98                 05MAY2021     01JAN2024          5                 23558
5    SIG005     SouthLine    7              240               High           15000             70                 12JUL2017     15SEP2023          9                 23436
6    SIG006     EastLine     4              120               Medium         8000              85                 09AUG2019     30NOV2023          7                 23497
7    SIG007     NorthLine    10             360               Critical       25000             60                 01JAN2016     05OCT2023          10                23467
8    SIG008     WestLine     3              90                Medium         6000              88                 18DEC2020     12DEC2023          6                 23528
9    SIG009     CentralLine  6              210               High           13000             75                 02FEB2018     22NOV2023          8                 23497
10   SIG010     CentralLine  1              30                Low            3000              97                 15JUN2022     05JAN2024          4                 23558
11   SIG011     MetroLine    8              260               High           17000             68                 20MAR2017     18SEP2023          9                 23436
12   SIG012     MetroLine    2              70                Medium         5000              90                 10JAN2021     02DEC2023          5                 23528
13   SIG013     FreightLine  11             400               Critical       30000             55                 01JAN2015     15OCT2023          11                23467
14   SIG014     FreightLine  3              100               Medium         7000              86                 05MAY2020     01DEC2023          6                 23528
15   SIG015     ExpressLine  4              110               Medium         9000              83                 12APR2019     28NOV2023          7                 23497
16   SIG001     SouthLine    5              180               High           12000             82                 01JAN2020     10DEC2023          6                 23528
17   SIG002     EastLine     2              60                Low            4000              95                 15MAR2019     05NOV2023          7                 23497
18   SIG003     NorthLine    9              300               Critical       20000             65                 10FEB2018     20OCT2023          8                 23467
19   SIG004     WestLine     1              20                Low            2000              98                 05MAY2021     01JAN2024          5                 23558
20   SIG005     SouthLine    7              240               High           15000             70                 12JUL2017     15SEP2023          9                 23436
21   SIG006     EastLine     4              120               Medium         8000              85                 09AUG2019     30NOV2023          7                 23497
22   SIG007     NorthLine    10             360               Critical       25000             60                 01JAN2016     05OCT2023          10                23467
23   SIG008     WestLine     3              90                Medium         6000              88                 18DEC2020     12DEC2023          6                 23528
24   SIG009     CentralLine  6              210               High           13000             75                 02FEB2018     22NOV2023          8                 23497
25   SIG010     CentralLine  1              30                Low            3000              97                 15JUN2022     05JAN2024          4                 23558
26   SIG011     MetroLine    8              260               High           17000             68                 20MAR2017     18SEP2023          9                 23436
27   SIG012     MetroLine    2              70                Medium         5000              90                 10JAN2021     02DEC2023          5                 23528
28   SIG013     FreightLine  11             400               Critical       30000             55                 01JAN2015     15OCT2023          11                23467
29   SIG014     FreightLine  3              100               Medium         7000              86                 05MAY2020     01DEC2023          6                 23528
30   SIG015     ExpressLine  4              110               Medium         9000              83                 12APR2019     28NOV2023          7                 23497

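Used for:

·       Combining months

·       Stacking records from several systems

·       Creating history tables

Because the base here is railway_dates itself, every signal now appears twice, which is why the later steps that read railway_dates print 30 rows.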
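A history table is usually kept separate from the working table, with a load date on every row so that repeated snapshots can be told apart. A minimal sketch, assuming a monthly load; the table names snapshot and signal_history and the snapshot date are hypothetical:

```sas
data snapshot;
    set railway_signals;
    /* Assumed load date for this batch */
    Snapshot_Date = '31JAN2024'd;
    format Snapshot_Date date9.;
run;

/* PROC APPEND creates signal_history on the first run and adds rows on later runs */
proc append base=signal_history data=snapshot;
run;
```

Because both tables have the same variables, FORCE is not needed, and the source table railway_signals is left untouched.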


12. TRANSPOSE

proc transpose data=route_summary out=transposed;
    by Route;
run;

proc print data=transposed;
run;

OUTPUT:

Obs  Route        _NAME_         COL1
1    CentralLine  Total_Signals  2.0
2    CentralLine  Avg_Failures   3.5
3    CentralLine  Total_Cost     16000.0
4    EastLine     Total_Signals  2.0
5    EastLine     Avg_Failures   3.0
6    EastLine     Total_Cost     12000.0
7    ExpressLine  Total_Signals  1.0
8    ExpressLine  Avg_Failures   4.0
9    ExpressLine  Total_Cost     9000.0
10   FreightLine  Total_Signals  2.0
11   FreightLine  Avg_Failures   7.0
12   FreightLine  Total_Cost     37000.0
13   MetroLine    Total_Signals  2.0
14   MetroLine    Avg_Failures   5.0
15   MetroLine    Total_Cost     22000.0
16   NorthLine    Total_Signals  2.0
17   NorthLine    Avg_Failures   9.5
18   NorthLine    Total_Cost     45000.0
19   SouthLine    Total_Signals  2.0
20   SouthLine    Avg_Failures   6.0
21   SouthLine    Total_Cost     27000.0
22   WestLine     Total_Signals  2.0
23   WestLine     Avg_Failures   2.0
24   WestLine     Total_Cost     8000.0

Used for:

·       Pivot reports

·       Excel-style layouts
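The BY-group transpose above turns columns into rows. For an Excel-style layout with one column per route, an ID statement can be used instead. A sketch, where the output table route_wide and the column name Metric are assumptions:

```sas
/* One row per metric, one column per route */
proc transpose data=route_summary out=route_wide name=Metric;
    id Route;                                   /* Route values become column names */
    var Total_Signals Avg_Failures Total_Cost;  /* each variable becomes one row */
run;

proc print data=route_wide noobs;
run;
```

This works because the route names are already valid SAS variable names and each route appears only once in route_summary.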


13. CHARACTER FUNCTIONS

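data char_demo;
    set railway_dates;
    /* PROPCASE lowercases every letter after the first in a word,
       so SouthLine becomes Southline */
    Route_Clean = propcase(strip(Route));
    Signal_Upper = upcase(Signal_ID);
    Safety_Lower = lowcase(Safety_Impact);
    /* CATX strips blanks and joins the parts with the delimiter */
    Full_Tag = catx("-", Signal_ID, Route);
run;

proc print data=char_demo;
    var Signal_ID Route Route_Clean Signal_Upper Safety_Lower Safety_Impact Full_Tag;
run;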

OUTPUT:

Obs  Signal_ID  Route        Route_Clean  Signal_Upper  Safety_Lower  Safety_Impact  Full_Tag
1    SIG001     SouthLine    Southline    SIG001        high          High           SIG001-SouthLine
2    SIG002     EastLine     Eastline     SIG002        low           Low            SIG002-EastLine
3    SIG003     NorthLine    Northline    SIG003        critical      Critical       SIG003-NorthLine
4    SIG004     WestLine     Westline     SIG004        low           Low            SIG004-WestLine
5    SIG005     SouthLine    Southline    SIG005        high          High           SIG005-SouthLine
6    SIG006     EastLine     Eastline     SIG006        medium        Medium         SIG006-EastLine
7    SIG007     NorthLine    Northline    SIG007        critical      Critical       SIG007-NorthLine
8    SIG008     WestLine     Westline     SIG008        medium        Medium         SIG008-WestLine
9    SIG009     CentralLine  Centralline  SIG009        high          High           SIG009-CentralLine
10   SIG010     CentralLine  Centralline  SIG010        low           Low            SIG010-CentralLine
11   SIG011     MetroLine    Metroline    SIG011        high          High           SIG011-MetroLine
12   SIG012     MetroLine    Metroline    SIG012        medium        Medium         SIG012-MetroLine
13   SIG013     FreightLine  Freightline  SIG013        critical      Critical       SIG013-FreightLine
14   SIG014     FreightLine  Freightline  SIG014        medium        Medium         SIG014-FreightLine
15   SIG015     ExpressLine  Expressline  SIG015        medium        Medium         SIG015-ExpressLine
16   SIG001     SouthLine    Southline    SIG001        high          High           SIG001-SouthLine
17   SIG002     EastLine     Eastline     SIG002        low           Low            SIG002-EastLine
18   SIG003     NorthLine    Northline    SIG003        critical      Critical       SIG003-NorthLine
19   SIG004     WestLine     Westline     SIG004        low           Low            SIG004-WestLine
20   SIG005     SouthLine    Southline    SIG005        high          High           SIG005-SouthLine
21   SIG006     EastLine     Eastline     SIG006        medium        Medium         SIG006-EastLine
22   SIG007     NorthLine    Northline    SIG007        critical      Critical       SIG007-NorthLine
23   SIG008     WestLine     Westline     SIG008        medium        Medium         SIG008-WestLine
24   SIG009     CentralLine  Centralline  SIG009        high          High           SIG009-CentralLine
25   SIG010     CentralLine  Centralline  SIG010        low           Low            SIG010-CentralLine
26   SIG011     MetroLine    Metroline    SIG011        high          High           SIG011-MetroLine
27   SIG012     MetroLine    Metroline    SIG012        medium        Medium         SIG012-MetroLine
28   SIG013     FreightLine  Freightline  SIG013        critical      Critical       SIG013-FreightLine
29   SIG014     FreightLine  Freightline  SIG014        medium        Medium         SIG014-FreightLine
30   SIG015     ExpressLine  Expressline  SIG015        medium        Medium         SIG015-ExpressLine

Used for:

·       Cleaning messy data

·       Standardizing names

·       Generating IDs
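When a display label should keep both capitals of a name like SouthLine, a regular expression can insert a space at each lower-to-upper boundary instead of changing case. A sketch using PRXCHANGE; the table route_labels and the variable Route_Label are assumptions:

```sas
data route_labels;
    set railway_signals;
    length Route_Label $20;
    /* Insert a space wherever a lowercase letter is followed by an uppercase one;
       -1 applies the substitution to every match in the string */
    Route_Label = prxchange('s/([a-z])([A-Z])/$1 $2/', -1, strip(Route));
run;

proc print data=route_labels;
    var Signal_ID Route Route_Label;
run;
```

For example, SouthLine becomes South Line, while the original Route value stays available for joins and grouping.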


14. NUMERIC FUNCTIONS

data numeric_demo;
    set railway_dates;
    /* Round to the nearest 1000; the sample costs are already multiples of 1000 */
    Cost_Rounded = round(Maintenance_Cost,1000);
    Downtime_Hours = Downtime_Minutes/60;
    /* COALESCE returns the first non-missing argument; nothing is missing here */
    Safe_Reliability = coalesce(Reliability_Index, 0);
run;

proc print data=numeric_demo;
    var Signal_ID Route Maintenance_Cost Cost_Rounded Downtime_Minutes Downtime_Hours
        Reliability_Index Safe_Reliability;
run;

OUTPUT:

Obs  Signal_ID  Route        Maintenance_Cost  Cost_Rounded  Downtime_Minutes  Downtime_Hours  Reliability_Index  Safe_Reliability
1    SIG001     SouthLine    12000             12000         180               3.00000         82                 82
2    SIG002     EastLine     4000              4000          60                1.00000         95                 95
3    SIG003     NorthLine    20000             20000         300               5.00000         65                 65
4    SIG004     WestLine     2000              2000          20                0.33333         98                 98
5    SIG005     SouthLine    15000             15000         240               4.00000         70                 70
6    SIG006     EastLine     8000              8000          120               2.00000         85                 85
7    SIG007     NorthLine    25000             25000         360               6.00000         60                 60
8    SIG008     WestLine     6000              6000          90                1.50000         88                 88
9    SIG009     CentralLine  13000             13000         210               3.50000         75                 75
10   SIG010     CentralLine  3000              3000          30                0.50000         97                 97
11   SIG011     MetroLine    17000             17000         260               4.33333         68                 68
12   SIG012     MetroLine    5000              5000          70                1.16667         90                 90
13   SIG013     FreightLine  30000             30000         400               6.66667         55                 55
14   SIG014     FreightLine  7000              7000          100               1.66667         86                 86
15   SIG015     ExpressLine  9000              9000          110               1.83333         83                 83
16   SIG001     SouthLine    12000             12000         180               3.00000         82                 82
17   SIG002     EastLine     4000              4000          60                1.00000         95                 95
18   SIG003     NorthLine    20000             20000         300               5.00000         65                 65
19   SIG004     WestLine     2000              2000          20                0.33333         98                 98
20   SIG005     SouthLine    15000             15000         240               4.00000         70                 70
21   SIG006     EastLine     8000              8000          120               2.00000         85                 85
22   SIG007     NorthLine    25000             25000         360               6.00000         60                 60
23   SIG008     WestLine     6000              6000          90                1.50000         88                 88
24   SIG009     CentralLine  13000             13000         210               3.50000         75                 75
25   SIG010     CentralLine  3000              3000          30                0.50000         97                 97
26   SIG011     MetroLine    17000             17000         260               4.33333         68                 68
27   SIG012     MetroLine    5000              5000          70                1.16667         90                 90
28   SIG013     FreightLine  30000             30000         400               6.66667         55                 55
29   SIG014     FreightLine  7000              7000          100               1.66667         86                 86
30   SIG015     ExpressLine  9000              9000          110               1.83333         83                 83

15. PROC DATASETS DELETE

proc datasets library=work;

    delete backup;

quit;

LOG:

NOTE: Deleting WORK.BACKUP (memtype=DATA).

Used to:

·       Free space in the WORK library

·       Avoid confusion

·       Clean temporary tables
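The same step can clear several intermediate tables at once. A sketch that removes the demo tables created in the earlier sections, assuming they are no longer needed:

```sas
/* NOLIST suppresses the directory listing that PROC DATASETS writes to the log */
proc datasets library=work nolist;
    delete utilization fraud_flags transposed char_demo numeric_demo;
quit;
```

Naming the tables explicitly is safer than the KILL option, which removes every member of the library.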


WHY THIS PROJECT IS VERY STRONG

This single project demonstrates:

·       Data creation

·       Business thinking

·       Statistical analysis

·       Automation

·       Fraud detection

·       Visualization

CONCLUSION

This project illustrates that railway signal data is more than numbers: it is directly connected to human safety and operational efficiency.

Using SAS on a simulated dataset, we were able to:

·       Identify high-risk signal systems

·       Measure reliability across routes

·       Visualize downtime and failure patterns

·       Automatically classify utilization levels

·       Detect suspicious maintenance behavior (fraud logic)

Instead of reacting after accidents happen, this kind of analytics helps organizations spot problems before they become disasters.

The biggest learning from this project is:

Data does not just support decisions; it can help prevent real-world failures.


INTERVIEW QUESTIONS FOR YOU

 1. What happens during the compilation and execution phases of a DATA step?

 2. What is the difference between SET and MERGE?

 3. What is the HAVING clause in PROC SQL?

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

About the Author:

SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.


Disclaimer:

The datasets and analysis in this article are created for educational and demonstration purposes only. They do not represent real railway signal data.


Our Mission:

This blog provides industry-focused SAS programming tutorials and analytics projects covering finance, healthcare, and technology.


This project is suitable for:

·  Students learning SAS

·  Data analysts building portfolios

·  Professionals preparing for SAS interviews

·  Bloggers writing about analytics and smart cities

·  EV and energy industry professionals
