404.What happens if maintenance cost is accidentally stored as character data, and how does SAS expose this error?

What happens if maintenance cost is accidentally stored as character data, and how does SAS expose this error?

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

HERE WE USED THESE SAS STATEMENTS AND FUNCTIONS FOR THIS PROJECT:

DATA STEP | SET | INPUT | DATALINES | FORMAT | KEEP | IF-THEN-ELSE | LENGTH | PUT | PROC CONTENTS | PROC DATASETS | PROC APPEND | PROC SQL | PROC MEANS | PROC UNIVARIATE | PROC FREQ | PROC CORR | PROC TRANSPOSE | PROC SGPLOT | MACRO | %MACRO / %MEND | INTCK | INTNX | MDY | CAT | CATX | STRIP | TRIM | UPCASE | LOWCASE | PROPCASE | COALESCE

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

TABLE OF CONTENTS

1.     Introduction

2.     Business Context

3.     Project Objectives

4.     Dataset Design & Variables

5.     Raw Dataset Creation (with Intentional Error)

6.     Identifying & Explaining the Error

7.     Corrected Dataset – Full SAS Code

8.     Data Preparation & Formatting

9.     Date Handling (MDY, INTCK, INTNX)

10.  Character & Numeric Functions Usage

11.  Reliability & Utilization Classification (Macros)

12.  Fraud Detection Logic (Macros)

13.  Descriptive Analytics (PROC MEANS, UNIVARIATE)

14.  Frequency Analysis (PROC FREQ)

15.  Correlation Analysis (PROC CORR)

16.  SQL-Based Analytics

17.  Data Reshaping (SET, MERGE, APPEND, TRANSPOSE)

18.  Visualization (PROC SGPLOT)

19.  Dataset Cleanup (PROC DATASETS DELETE)

20.  Business Insights

21.  21 Key Points About This Project

22.  Conclusion

1.INTRODUCTION

Lighthouses are critical maritime safety infrastructure.
They guide ships, prevent accidents, and ensure smooth navigation during storms, fog, and night operations.

In this project, we treat lighthouses like operational assets, similar to data centers, telecom towers, or hospitals.
We analyze:

·       Reliability

·       Maintenance costs

·       Failure incidents

·       Power usage

·       Fraud or abnormal behavior

·       Operational efficiency

2. BUSINESS CONTEXT

Maritime authorities face challenges like:

·       High maintenance expenses

·       Power failures

·       Equipment aging

·       False maintenance claims

·       Inconsistent reliability

Using SAS, we can:

·       Detect risky lighthouses

·       Identify fraud-prone maintenance records

·       Optimize maintenance planning

·       Improve safety compliance

3. PROJECT OBJECTIVES

1. Create a lighthouse dataset (15+ observations)
2. Introduce intentional data error
3. Detect and fix the error
4. Perform full analytics using SAS
5. Use macros, SQL, dates, functions
6. Generate visual insights
7. Prepare interview-grade explanations

4. DATASET DESIGN

Variables Used

Variable

Description

Lighthouse_ID

Unique ID

Lighthouse_Name

Name

Coastline

East / West / South

Visibility_Range

Nautical miles

Maintenance_Cost

Annual cost

Failure_Incidents

Yearly failures

Power_Source

Solar / Electric / Hybrid

Installation_Date

Commission date

Last_Service_Date

Last maintenance

Reliability_Index

Calculated percentage

Utilization_Class

High / Medium / Low

Fraud_Flag

Yes / No

5. RAW DATASET (WITH INTENTIONAL ERROR)

Intentional Error:

Maintenance_Cost entered as character ("$45000") instead of numeric

data lighthouse_raw;

    input Lighthouse_ID Lighthouse_Name:$25. Coastline:$10.

          Visibility_Range Maintenance_Cost $ Failure_Incidents

          Power_Source:$10. Installation_Date : date9.

          Last_Service_Date : date9.;

    format Installation_Date Last_Service_Date date9.;

datalines;

101 Alpha_Point East 25 $45000 2 Solar 15JAN2015 12JAN2025

102 Beacon_Rock West 30 $52000 1 Electric 10FEB2016 05JAN2025

103 Coral_Light South 18 $38000 4 Hybrid 22MAR2014 20DEC2024

;

run;

proc print data=lighthouse_raw;

run;

OUTPUT:

ObsLighthouse_IDLighthouse_NameCoastlineVisibility_RangeMaintenance_CostFailure_IncidentsPower_SourceInstallation_DateLast_Service_Date
1101Alpha_PointEast25$450002Solar15JAN201512JAN2025
2102Beacon_RockWest30$520001Electric10FEB201605JAN2025
3103Coral_LightSouth18$380004Hybrid22MAR201420DEC2024

6.IDENTIFYING THE ERROR

·  Maintenance_Cost should be numeric

·  Stored as character

·  Causes:

·       PROC MEANS failure

·       Incorrect calculations

·       SQL aggregation errors

Detection

proc contents data=lighthouse_raw;

run;

OUTPUT:

The CONTENTS Procedure

Data Set NameWORK.LIGHTHOUSE_RAWObservations3
Member TypeDATAVariables9
EngineV9Indexes0
Created02/17/2026 20:00:36Observation Length96
Last Modified02/17/2026 20:00:36Deleted Observations0
Protection CompressedNO
Data Set Type SortedNO
Label   
Data RepresentationSOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64  
Encodingutf-8 Unicode (UTF-8)  
Engine/Host Dependent Information
Data Set Page Size131072
Number of Data Set Pages1
First Data Page1
Max Obs per Page1363
Obs in First Data Page3
Number of Data Set Repairs0
Filename/saswork/SAS_workFFCC00019F1A_odaws02-apse1-2.oda.sas.com/SAS_work313F00019F1A_odaws02-apse1-2.oda.sas.com/lighthouse_raw.sas7bdat
Release Created9.0401M8
Host CreatedLinux
Inode Number67111648
Access Permissionrw-r--r--
Owner Nameu63247146
File Size256KB
File Size (bytes)262144
Alphabetic List of Variables and Attributes
#VariableTypeLenFormat
3CoastlineChar10 
6Failure_IncidentsNum8 
8Installation_DateNum8DATE9.
9Last_Service_DateNum8DATE9.
1Lighthouse_IDNum8 
2Lighthouse_NameChar25 
5Maintenance_CostChar8 
7Power_SourceChar10 
4Visibility_RangeNum8 
7. CORRECTED DATASET (FULL & FINAL)

data lighthouse;

    input Lighthouse_ID Lighthouse_Name:$25. Coastline:$10.

          Visibility_Range Maintenance_Cost Failure_Incidents

          Power_Source:$10. Installation_Date : date9.

          Last_Service_Date : date9.;

    format Installation_Date Last_Service_Date date9.;

datalines;

101 Alpha_Point East 25 45000 2 Solar 15JAN2015 12JAN2025

102 Beacon_Rock West 30 52000 1 Electric 10FEB2016 05JAN2025

103 Coral_Light South 18 38000 4 Hybrid 22MAR2014 20DEC2024

104 Delta_Beam East 22 41000 3 Solar 11APR2017 18JAN2025

105 Eagle_Eye West 35 61000 0 Electric 09MAY2018 01JAN2025

106 Falcon_Tower South 20 39500 2 Hybrid 15JUN2016 10DEC2024

107 Gulf_Guide East 28 47000 1 Solar 12JUL2019 02JAN2025

108 Harbor_Light West 32 58000 0 Electric 18AUG2020 05JAN2025

109 Island_Beam South 16 36000 5 Hybrid 21SEP2013 22DEC2024

110 Jetty_Light East 26 45500 2 Solar 10OCT2015 12JAN2025

111 Keystone West 34 60000 1 Electric 15NOV2017 05JAN2025

112 Lagoon_Light South 19 39000 3 Hybrid 01DEC2016 20DEC2024

113 Marina_Beam East 27 46500 1 Solar 25JAN2019 02JAN2025

114 Nautical_Point West 33 59000 0 Electric 11FEB2021 05JAN2025

115 Ocean_Guard South 21 40500 4 Hybrid 18MAR2014 18DEC2024

;

run;

proc print data=lighthouse;

run;

OUTPUT:

ObsLighthouse_IDLighthouse_NameCoastlineVisibility_RangeMaintenance_CostFailure_IncidentsPower_SourceInstallation_DateLast_Service_Date
1101Alpha_PointEast25450002Solar15JAN201512JAN2025
2102Beacon_RockWest30520001Electric10FEB201605JAN2025
3103Coral_LightSouth18380004Hybrid22MAR201420DEC2024
4104Delta_BeamEast22410003Solar11APR201718JAN2025
5105Eagle_EyeWest35610000Electric09MAY201801JAN2025
6106Falcon_TowerSouth20395002Hybrid15JUN201610DEC2024
7107Gulf_GuideEast28470001Solar12JUL201902JAN2025
8108Harbor_LightWest32580000Electric18AUG202005JAN2025
9109Island_BeamSouth16360005Hybrid21SEP201322DEC2024
10110Jetty_LightEast26455002Solar10OCT201512JAN2025
11111KeystoneWest34600001Electric15NOV201705JAN2025
12112Lagoon_LightSouth19390003Hybrid01DEC201620DEC2024
13113Marina_BeamEast27465001Solar25JAN201902JAN2025
14114Nautical_PointWest33590000Electric11FEB202105JAN2025
15115Ocean_GuardSouth21405004Hybrid18MAR201418DEC2024

8. RELIABILITY INDEX CALCULATION

data lighthouse_enriched;

    set lighthouse;

    Reliability_Index = (1 - (Failure_Incidents/10)) * 100;

run;

proc print data=lighthouse_enriched;

run;

OUTPUT:

ObsLighthouse_IDLighthouse_NameCoastlineVisibility_RangeMaintenance_CostFailure_IncidentsPower_SourceInstallation_DateLast_Service_DateReliability_Index
1101Alpha_PointEast25450002Solar15JAN201512JAN202580
2102Beacon_RockWest30520001Electric10FEB201605JAN202590
3103Coral_LightSouth18380004Hybrid22MAR201420DEC202460
4104Delta_BeamEast22410003Solar11APR201718JAN202570
5105Eagle_EyeWest35610000Electric09MAY201801JAN2025100
6106Falcon_TowerSouth20395002Hybrid15JUN201610DEC202480
7107Gulf_GuideEast28470001Solar12JUL201902JAN202590
8108Harbor_LightWest32580000Electric18AUG202005JAN2025100
9109Island_BeamSouth16360005Hybrid21SEP201322DEC202450
10110Jetty_LightEast26455002Solar10OCT201512JAN202580
11111KeystoneWest34600001Electric15NOV201705JAN202590
12112Lagoon_LightSouth19390003Hybrid01DEC201620DEC202470
13113Marina_BeamEast27465001Solar25JAN201902JAN202590
14114Nautical_PointWest33590000Electric11FEB202105JAN2025100
15115Ocean_GuardSouth21405004Hybrid18MAR201418DEC202460

·  Converts failures into percentage reliability

·  Standard KPI used in asset management

9. DATE FUNCTIONS (MDY, INTCK, INTNX)

data lighthouse_dates;

    set lighthouse_enriched;

    Next_Service_Date = intnx('month', Last_Service_Date, 6);

    Years_In_Service = intck('year', Installation_Date, today());

    format Next_Service_Date date9.;

run;

proc print data=lighthouse_dates;

 var Lighthouse_ID Lighthouse_Name Coastline Next_Service_Date Years_In_Service 

      Installation_Date;

run;

OUTPUT:

ObsLighthouse_IDLighthouse_NameCoastlineNext_Service_DateYears_In_ServiceInstallation_Date
1101Alpha_PointEast01JUL20251115JAN2015
2102Beacon_RockWest01JUL20251010FEB2016
3103Coral_LightSouth01JUN20251222MAR2014
4104Delta_BeamEast01JUL2025911APR2017
5105Eagle_EyeWest01JUL2025809MAY2018
6106Falcon_TowerSouth01JUN20251015JUN2016
7107Gulf_GuideEast01JUL2025712JUL2019
8108Harbor_LightWest01JUL2025618AUG2020
9109Island_BeamSouth01JUN20251321SEP2013
10110Jetty_LightEast01JUL20251110OCT2015
11111KeystoneWest01JUL2025915NOV2017
12112Lagoon_LightSouth01JUN20251001DEC2016
13113Marina_BeamEast01JUL2025725JAN2019
14114Nautical_PointWest01JUL2025511FEB2021
15115Ocean_GuardSouth01JUN20251218MAR2014

10. CHARACTER & NUMERIC FUNCTIONS

data lighthouse_clean;

    set lighthouse_dates;

    Lighthouse_Name = propcase(strip(Lighthouse_Name));

    Coastline = upcase(Coastline);

    Power_Source = lowcase(Power_Source);

    Full_Name = catx(' - ', Lighthouse_Name, Coastline);

    Cost_Filled = coalesce(Maintenance_Cost,0);

run;

proc print data=lighthouse_clean;

 var Lighthouse_ID Lighthouse_Name Coastline Power_Source Full_Name Cost_Filled

     Maintenance_Cost;

run;

OUTPUT:

ObsLighthouse_IDLighthouse_NameCoastlinePower_SourceFull_NameCost_FilledMaintenance_Cost
1101Alpha_pointEASTsolarAlpha_point - EAST4500045000
2102Beacon_rockWESTelectricBeacon_rock - WEST5200052000
3103Coral_lightSOUTHhybridCoral_light - SOUTH3800038000
4104Delta_beamEASTsolarDelta_beam - EAST4100041000
5105Eagle_eyeWESTelectricEagle_eye - WEST6100061000
6106Falcon_towerSOUTHhybridFalcon_tower - SOUTH3950039500
7107Gulf_guideEASTsolarGulf_guide - EAST4700047000
8108Harbor_lightWESTelectricHarbor_light - WEST5800058000
9109Island_beamSOUTHhybridIsland_beam - SOUTH3600036000
10110Jetty_lightEASTsolarJetty_light - EAST4550045500
11111KeystoneWESTelectricKeystone - WEST6000060000
12112Lagoon_lightSOUTHhybridLagoon_light - SOUTH3900039000
13113Marina_beamEASTsolarMarina_beam - EAST4650046500
14114Nautical_pointWESTelectricNautical_point - WEST5900059000
15115Ocean_guardSOUTHhybridOcean_guard - SOUTH4050040500

11. UTILIZATION CLASSIFICATION (MACRO)

%macro utilization;

data lighthouse_util;

    set lighthouse_clean;

length Utilization_Class $8.;

    if Reliability_Index >= 90 then Utilization_Class='HIGH';

    else if Reliability_Index >= 75 then Utilization_Class='MEDIUM';

    else Utilization_Class='LOW';

run;

proc print data=lighthouse_util;

run;

%mend;


%utilization;

OUTPUT:

ObsLighthouse_IDLighthouse_NameCoastlineVisibility_RangeMaintenance_CostFailure_IncidentsPower_SourceInstallation_DateLast_Service_DateReliability_IndexNext_Service_DateYears_In_ServiceFull_NameCost_FilledUtilization_Class
1101Alpha_pointEAST25450002solar15JAN201512JAN20258001JUL202511Alpha_point - EAST45000MEDIUM
2102Beacon_rockWEST30520001electric10FEB201605JAN20259001JUL202510Beacon_rock - WEST52000HIGH
3103Coral_lightSOUTH18380004hybrid22MAR201420DEC20246001JUN202512Coral_light - SOUTH38000LOW
4104Delta_beamEAST22410003solar11APR201718JAN20257001JUL20259Delta_beam - EAST41000LOW
5105Eagle_eyeWEST35610000electric09MAY201801JAN202510001JUL20258Eagle_eye - WEST61000HIGH
6106Falcon_towerSOUTH20395002hybrid15JUN201610DEC20248001JUN202510Falcon_tower - SOUTH39500MEDIUM
7107Gulf_guideEAST28470001solar12JUL201902JAN20259001JUL20257Gulf_guide - EAST47000HIGH
8108Harbor_lightWEST32580000electric18AUG202005JAN202510001JUL20256Harbor_light - WEST58000HIGH
9109Island_beamSOUTH16360005hybrid21SEP201322DEC20245001JUN202513Island_beam - SOUTH36000LOW
10110Jetty_lightEAST26455002solar10OCT201512JAN20258001JUL202511Jetty_light - EAST45500MEDIUM
11111KeystoneWEST34600001electric15NOV201705JAN20259001JUL20259Keystone - WEST60000HIGH
12112Lagoon_lightSOUTH19390003hybrid01DEC201620DEC20247001JUN202510Lagoon_light - SOUTH39000LOW
13113Marina_beamEAST27465001solar25JAN201902JAN20259001JUL20257Marina_beam - EAST46500HIGH
14114Nautical_pointWEST33590000electric11FEB202105JAN202510001JUL20255Nautical_point - WEST59000HIGH
15115Ocean_guardSOUTH21405004hybrid18MAR201418DEC20246001JUN202512Ocean_guard - SOUTH40500LOW

12. FRAUD DETECTION LOGIC (MACRO)

%macro fraud_check;

data lighthouse_fraud;

    set lighthouse_util;

    if Maintenance_Cost > 55000 and Failure_Incidents < 2 then Fraud_Flag='YES';

    else Fraud_Flag='NO';

run;

proc print data=lighthouse_fraud;

run;

%mend;


%fraud_check;

OUTPUT:

ObsLighthouse_IDLighthouse_NameCoastlineVisibility_RangeMaintenance_CostFailure_IncidentsPower_SourceInstallation_DateLast_Service_DateReliability_IndexNext_Service_DateYears_In_ServiceFull_NameCost_FilledUtilization_ClassFraud_Flag
1101Alpha_pointEAST25450002solar15JAN201512JAN20258001JUL202511Alpha_point - EAST45000MEDIUMNO
2102Beacon_rockWEST30520001electric10FEB201605JAN20259001JUL202510Beacon_rock - WEST52000HIGHNO
3103Coral_lightSOUTH18380004hybrid22MAR201420DEC20246001JUN202512Coral_light - SOUTH38000LOWNO
4104Delta_beamEAST22410003solar11APR201718JAN20257001JUL20259Delta_beam - EAST41000LOWNO
5105Eagle_eyeWEST35610000electric09MAY201801JAN202510001JUL20258Eagle_eye - WEST61000HIGHYES
6106Falcon_towerSOUTH20395002hybrid15JUN201610DEC20248001JUN202510Falcon_tower - SOUTH39500MEDIUMNO
7107Gulf_guideEAST28470001solar12JUL201902JAN20259001JUL20257Gulf_guide - EAST47000HIGHNO
8108Harbor_lightWEST32580000electric18AUG202005JAN202510001JUL20256Harbor_light - WEST58000HIGHYES
9109Island_beamSOUTH16360005hybrid21SEP201322DEC20245001JUN202513Island_beam - SOUTH36000LOWNO
10110Jetty_lightEAST26455002solar10OCT201512JAN20258001JUL202511Jetty_light - EAST45500MEDIUMNO
11111KeystoneWEST34600001electric15NOV201705JAN20259001JUL20259Keystone - WEST60000HIGHYES
12112Lagoon_lightSOUTH19390003hybrid01DEC201620DEC20247001JUN202510Lagoon_light - SOUTH39000LOWNO
13113Marina_beamEAST27465001solar25JAN201902JAN20259001JUL20257Marina_beam - EAST46500HIGHNO
14114Nautical_pointWEST33590000electric11FEB202105JAN202510001JUL20255Nautical_point - WEST59000HIGHYES
15115Ocean_guardSOUTH21405004hybrid18MAR201418DEC20246001JUN202512Ocean_guard - SOUTH40500LOWNO

13. DESCRIPTIVE STATISTICS

proc means data=lighthouse_fraud mean min max;

    var Maintenance_Cost Reliability_Index Failure_Incidents;

run;

OUTPUT:

The MEANS Procedure

VariableMeanMinimumMaximum
Maintenance_Cost
Reliability_Index
Failure_Incidents
47200.00
80.6666667
1.9333333
36000.00
50.0000000
0
61000.00
100.0000000
5.0000000

proc univariate data=lighthouse_fraud;

    var Reliability_Index;

run;

OUTPUT:

The UNIVARIATE Procedure

Variable: Reliability_Index

Moments
N15Sum Weights15
Mean80.6666667Sum Observations1210
Std Deviation15.7963227Variance249.52381
Skewness-0.5016516Kurtosis-0.7252805
Uncorrected SS101100Corrected SS3493.33333
Coeff Variation19.5822182Std Error Mean4.07859297
Basic Statistical Measures
LocationVariability
Mean80.66667Std Deviation15.79632
Median80.00000Variance249.52381
Mode90.00000Range50.00000
  Interquartile Range20.00000
Tests for Location: Mu0=0
TestStatisticp Value
Student's tt19.77806Pr > |t|<.0001
SignM7.5Pr >= |M|<.0001
Signed RankS60Pr >= |S|<.0001
Quantiles (Definition 5)
LevelQuantile
100% Max100
99%100
95%100
90%100
75% Q390
50% Median80
25% Q170
10%60
5%50
1%50
0% Min50
Extreme Observations
LowestHighest
ValueObsValueObs
5099011
60159013
6031005
70121008
70410014

14. FREQUENCY ANALYSIS

proc freq data=lighthouse_fraud;

    tables Coastline Power_Source Utilization_Class Fraud_Flag;

run;

OUTPUT:

The FREQ Procedure

CoastlineFrequencyPercentCumulative
Frequency
Cumulative
Percent
EAST533.33533.33
SOUTH533.331066.67
WEST533.3315100.00
Power_SourceFrequencyPercentCumulative
Frequency
Cumulative
Percent
electric533.33533.33
hybrid533.331066.67
solar533.3315100.00
Utilization_ClassFrequencyPercentCumulative
Frequency
Cumulative
Percent
HIGH746.67746.67
LOW533.331280.00
MEDIUM320.0015100.00
Fraud_FlagFrequencyPercentCumulative
Frequency
Cumulative
Percent
NO1173.331173.33
YES426.6715100.00

15. CORRELATION

proc corr data=lighthouse_fraud;

    var Maintenance_Cost Failure_Incidents Reliability_Index;

run;

OUTPUT:

The CORR Procedure

3 Variables:Maintenance_Cost Failure_Incidents Reliability_Index
Simple Statistics
VariableNMeanStd DevSumMinimumMaximum
Maintenance_Cost154720087077080003600061000
Failure_Incidents151.933331.5796329.0000005.00000
Reliability_Index1580.6666715.79632121050.00000100.00000
Pearson Correlation Coefficients, N = 15
Prob > |r| under H0: Rho=0
 Maintenance_CostFailure_IncidentsReliability_Index
Maintenance_Cost
1.00000
 
-0.87922
<.0001
0.87922
<.0001
Failure_Incidents
-0.87922
<.0001
1.00000
 
-1.00000
<.0001
Reliability_Index
0.87922
<.0001
-1.00000
<.0001
1.00000
 

16. SQL ANALYTICS

proc sql;

    select Coastline,

           avg(Maintenance_Cost) as Avg_Cost,

           avg(Reliability_Index) as Avg_Reliability

    from lighthouse_fraud

    group by Coastline;

quit;

OUTPUT:

CoastlineAvg_CostAvg_Reliability
EAST4500082
SOUTH3860064
WEST5800096

17. APPEND, TRANSPOSE

WHAT IS WRONG WITH THIS CODE?

Why this is incorrect

Issue

Explanation

Same BASE & DATA

You are appending a dataset into itself

Redundant

No new observations are added

Risky

In some cases, causes duplicate inflation

Interview red flag

Shows lack of data-flow understanding


**PROC APPEND is meant to combine TWO DIFFERENT datasets

·  Monthly → Yearly

·  Current → Historical

·  New batch → Master table


proc append base=lighthouse_fraud 

            data=lighthouse_fraud force;

run;

proc print data=lighthouse_fraud;

run;

OUTPUT:

ObsLighthouse_IDLighthouse_NameCoastlineVisibility_RangeMaintenance_CostFailure_IncidentsPower_SourceInstallation_DateLast_Service_DateReliability_IndexNext_Service_DateYears_In_ServiceFull_NameCost_FilledUtilization_ClassFraud_Flag
1101Alpha_pointEAST25450002solar15JAN201512JAN20258001JUL202511Alpha_point - EAST45000MEDIUMNO
2102Beacon_rockWEST30520001electric10FEB201605JAN20259001JUL202510Beacon_rock - WEST52000HIGHNO
3103Coral_lightSOUTH18380004hybrid22MAR201420DEC20246001JUN202512Coral_light - SOUTH38000LOWNO
4104Delta_beamEAST22410003solar11APR201718JAN20257001JUL20259Delta_beam - EAST41000LOWNO
5105Eagle_eyeWEST35610000electric09MAY201801JAN202510001JUL20258Eagle_eye - WEST61000HIGHYES
6106Falcon_towerSOUTH20395002hybrid15JUN201610DEC20248001JUN202510Falcon_tower - SOUTH39500MEDIUMNO
7107Gulf_guideEAST28470001solar12JUL201902JAN20259001JUL20257Gulf_guide - EAST47000HIGHNO
8108Harbor_lightWEST32580000electric18AUG202005JAN202510001JUL20256Harbor_light - WEST58000HIGHYES
9109Island_beamSOUTH16360005hybrid21SEP201322DEC20245001JUN202513Island_beam - SOUTH36000LOWNO
10110Jetty_lightEAST26455002solar10OCT201512JAN20258001JUL202511Jetty_light - EAST45500MEDIUMNO
11111KeystoneWEST34600001electric15NOV201705JAN20259001JUL20259Keystone - WEST60000HIGHYES
12112Lagoon_lightSOUTH19390003hybrid01DEC201620DEC20247001JUN202510Lagoon_light - SOUTH39000LOWNO
13113Marina_beamEAST27465001solar25JAN201902JAN20259001JUL20257Marina_beam - EAST46500HIGHNO
14114Nautical_pointWEST33590000electric11FEB202105JAN202510001JUL20255Nautical_point - WEST59000HIGHYES
15115Ocean_guardSOUTH21405004hybrid18MAR201418DEC20246001JUN202512Ocean_guard - SOUTH40500LOWNO
16101Alpha_pointEAST25450002solar15JAN201512JAN20258001JUL202511Alpha_point - EAST45000MEDIUMNO
17102Beacon_rockWEST30520001electric10FEB201605JAN20259001JUL202510Beacon_rock - WEST52000HIGHNO
18103Coral_lightSOUTH18380004hybrid22MAR201420DEC20246001JUN202512Coral_light - SOUTH38000LOWNO
19104Delta_beamEAST22410003solar11APR201718JAN20257001JUL20259Delta_beam - EAST41000LOWNO
20105Eagle_eyeWEST35610000electric09MAY201801JAN202510001JUL20258Eagle_eye - WEST61000HIGHYES
21106Falcon_towerSOUTH20395002hybrid15JUN201610DEC20248001JUN202510Falcon_tower - SOUTH39500MEDIUMNO
22107Gulf_guideEAST28470001solar12JUL201902JAN20259001JUL20257Gulf_guide - EAST47000HIGHNO
23108Harbor_lightWEST32580000electric18AUG202005JAN202510001JUL20256Harbor_light - WEST58000HIGHYES
24109Island_beamSOUTH16360005hybrid21SEP201322DEC20245001JUN202513Island_beam - SOUTH36000LOWNO
25110Jetty_lightEAST26455002solar10OCT201512JAN20258001JUL202511Jetty_light - EAST45500MEDIUMNO
26111KeystoneWEST34600001electric15NOV201705JAN20259001JUL20259Keystone - WEST60000HIGHYES
27112Lagoon_lightSOUTH19390003hybrid01DEC201620DEC20247001JUN202510Lagoon_light - SOUTH39000LOWNO
28113Marina_beamEAST27465001solar25JAN201902JAN20259001JUL20257Marina_beam - EAST46500HIGHNO
29114Nautical_pointWEST33590000electric11FEB202105JAN202510001JUL20255Nautical_point - WEST59000HIGHYES
30115Ocean_guardSOUTH21405004hybrid18MAR201418DEC20246001JUN202512Ocean_guard - SOUTH40500LOWNO

Approach 1:APPEND NEW MAINTENANCE DATA

Why this is correct

·  base= → existing master dataset

·  data=new incoming records

·  force → allows structure mismatch 

Step 1: Create a NEW dataset

data lighthouse_new;

    input Lighthouse_ID Lighthouse_Name:$25. Coastline:$10. Visibility_Range

           Maintenance_Cost Failure_Incidents Power_Source:$10.

           Installation_Date : date9. Last_Service_Date : date9.

           Reliability_Index Utilization_Class:$6. Fraud_Flag:$3.;

    format Installation_Date Last_Service_Date date9.;

datalines;

116 Pearl_Light East 29 48000 1 solar 20APR2019 10JAN2026 90 HIGH NO

117 Reef_Beam West 36 62000 4 electric 15MAY2018 12JAN2026 60 LOW YES

;

run;

proc print data=lighthouse_new;

run;

OUTPUT:

ObsLighthouse_IDLighthouse_NameCoastlineVisibility_RangeMaintenance_CostFailure_IncidentsPower_SourceInstallation_DateLast_Service_DateReliability_IndexUtilization_ClassFraud_Flag
1116Pearl_LightEast29480001solar20APR201910JAN202690HIGHNO
2117Reef_BeamWest36620004electric15MAY201812JAN202660LOWYES

Step 2: Append CORRECTLY

proc append base=lighthouse_fraud

            data=lighthouse_new force;

run;

proc print data=lighthouse_fraud;

run;

OUTPUT:

ObsLighthouse_IDLighthouse_NameCoastlineVisibility_RangeMaintenance_CostFailure_IncidentsPower_SourceInstallation_DateLast_Service_DateReliability_IndexNext_Service_DateYears_In_ServiceFull_NameCost_FilledUtilization_ClassFraud_Flag
1101Alpha_pointEAST25450002solar15JAN201512JAN20258001JUL202511Alpha_point - EAST45000MEDIUMNO
2102Beacon_rockWEST30520001electric10FEB201605JAN20259001JUL202510Beacon_rock - WEST52000HIGHNO
3103Coral_lightSOUTH18380004hybrid22MAR201420DEC20246001JUN202512Coral_light - SOUTH38000LOWNO
4104Delta_beamEAST22410003solar11APR201718JAN20257001JUL20259Delta_beam - EAST41000LOWNO
5105Eagle_eyeWEST35610000electric09MAY201801JAN202510001JUL20258Eagle_eye - WEST61000HIGHYES
6106Falcon_towerSOUTH20395002hybrid15JUN201610DEC20248001JUN202510Falcon_tower - SOUTH39500MEDIUMNO
7107Gulf_guideEAST28470001solar12JUL201902JAN20259001JUL20257Gulf_guide - EAST47000HIGHNO
8108Harbor_lightWEST32580000electric18AUG202005JAN202510001JUL20256Harbor_light - WEST58000HIGHYES
9109Island_beamSOUTH16360005hybrid21SEP201322DEC20245001JUN202513Island_beam - SOUTH36000LOWNO
10110Jetty_lightEAST26455002solar10OCT201512JAN20258001JUL202511Jetty_light - EAST45500MEDIUMNO
11111KeystoneWEST34600001electric15NOV201705JAN20259001JUL20259Keystone - WEST60000HIGHYES
12112Lagoon_lightSOUTH19390003hybrid01DEC201620DEC20247001JUN202510Lagoon_light - SOUTH39000LOWNO
13113Marina_beamEAST27465001solar25JAN201902JAN20259001JUL20257Marina_beam - EAST46500HIGHNO
14114Nautical_pointWEST33590000electric11FEB202105JAN202510001JUL20255Nautical_point - WEST59000HIGHYES
15115Ocean_guardSOUTH21405004hybrid18MAR201418DEC20246001JUN202512Ocean_guard - SOUTH40500LOWNO
16116Pearl_LightEast29480001solar20APR201910JAN202690.. .HIGHNO
17117Reef_BeamWest36620004electric15MAY201812JAN202660.. .LOWYES

APPROACH 2: SPLIT → APPEND

Step 1: Split

data east west south;

    set lighthouse_fraud;

    if Coastline='EAST' then output east;

    else if Coastline='WEST' then output west;

    else if Coastline='SOUTH' then output south;

run;

proc print data=east;

run;

OUTPUT:

ObsLighthouse_IDLighthouse_NameCoastlineVisibility_RangeMaintenance_CostFailure_IncidentsPower_SourceInstallation_DateLast_Service_DateReliability_IndexNext_Service_DateYears_In_ServiceFull_NameCost_FilledUtilization_ClassFraud_Flag
1101Alpha_pointEAST25450002solar15JAN201512JAN20258001JUL202511Alpha_point - EAST45000MEDIUMNO
2104Delta_beamEAST22410003solar11APR201718JAN20257001JUL20259Delta_beam - EAST41000LOWNO
3107Gulf_guideEAST28470001solar12JUL201902JAN20259001JUL20257Gulf_guide - EAST47000HIGHNO
4110Jetty_lightEAST26455002solar10OCT201512JAN20258001JUL202511Jetty_light - EAST45500MEDIUMNO
5113Marina_beamEAST27465001solar25JAN201902JAN20259001JUL20257Marina_beam - EAST46500HIGHNO

proc print data=west;

run;

OUTPUT:

ObsLighthouse_IDLighthouse_NameCoastlineVisibility_RangeMaintenance_CostFailure_IncidentsPower_SourceInstallation_DateLast_Service_DateReliability_IndexNext_Service_DateYears_In_ServiceFull_NameCost_FilledUtilization_ClassFraud_Flag
1102Beacon_rockWEST30520001electric10FEB201605JAN20259001JUL202510Beacon_rock - WEST52000HIGHNO
2105Eagle_eyeWEST35610000electric09MAY201801JAN202510001JUL20258Eagle_eye - WEST61000HIGHYES
3108Harbor_lightWEST32580000electric18AUG202005JAN202510001JUL20256Harbor_light - WEST58000HIGHYES
4111KeystoneWEST34600001electric15NOV201705JAN20259001JUL20259Keystone - WEST60000HIGHYES
5114Nautical_pointWEST33590000electric11FEB202105JAN202510001JUL20255Nautical_point - WEST59000HIGHYES

proc print data=south;

run;

OUTPUT:

ObsLighthouse_IDLighthouse_NameCoastlineVisibility_RangeMaintenance_CostFailure_IncidentsPower_SourceInstallation_DateLast_Service_DateReliability_IndexNext_Service_DateYears_In_ServiceFull_NameCost_FilledUtilization_ClassFraud_Flag
1103Coral_lightSOUTH18380004hybrid22MAR201420DEC20246001JUN202512Coral_light - SOUTH38000LOWNO
2106Falcon_towerSOUTH20395002hybrid15JUN201610DEC20248001JUN202510Falcon_tower - SOUTH39500MEDIUMNO
3109Island_beamSOUTH16360005hybrid21SEP201322DEC20245001JUN202513Island_beam - SOUTH36000LOWNO
4112Lagoon_lightSOUTH19390003hybrid01DEC201620DEC20247001JUN202510Lagoon_light - SOUTH39000LOWNO
5115Ocean_guardSOUTH21405004hybrid18MAR201418DEC20246001JUN202512Ocean_guard - SOUTH40500LOWNO

Step 2: Append Back

proc append base=east 

            data=west force;

run;

proc print data=east;

run;

OUTPUT:

ObsLighthouse_IDLighthouse_NameCoastlineVisibility_RangeMaintenance_CostFailure_IncidentsPower_SourceInstallation_DateLast_Service_DateReliability_IndexNext_Service_DateYears_In_ServiceFull_NameCost_FilledUtilization_ClassFraud_Flag
1101Alpha_pointEAST25450002solar15JAN201512JAN20258001JUL202511Alpha_point - EAST45000MEDIUMNO
2104Delta_beamEAST22410003solar11APR201718JAN20257001JUL20259Delta_beam - EAST41000LOWNO
3107Gulf_guideEAST28470001solar12JUL201902JAN20259001JUL20257Gulf_guide - EAST47000HIGHNO
4110Jetty_lightEAST26455002solar10OCT201512JAN20258001JUL202511Jetty_light - EAST45500MEDIUMNO
5113Marina_beamEAST27465001solar25JAN201902JAN20259001JUL20257Marina_beam - EAST46500HIGHNO
6102Beacon_rockWEST30520001electric10FEB201605JAN20259001JUL202510Beacon_rock - WEST52000HIGHNO
7105Eagle_eyeWEST35610000electric09MAY201801JAN202510001JUL20258Eagle_eye - WEST61000HIGHYES
8108Harbor_lightWEST32580000electric18AUG202005JAN202510001JUL20256Harbor_light - WEST58000HIGHYES
9111KeystoneWEST34600001electric15NOV201705JAN20259001JUL20259Keystone - WEST60000HIGHYES
10114Nautical_pointWEST33590000electric11FEB202105JAN202510001JUL20255Nautical_point - WEST59000HIGHYES

proc append base=east 

            data=south force;

run;

proc print data=east;

run;

OUTPUT:

ObsLighthouse_IDLighthouse_NameCoastlineVisibility_RangeMaintenance_CostFailure_IncidentsPower_SourceInstallation_DateLast_Service_DateReliability_IndexNext_Service_DateYears_In_ServiceFull_NameCost_FilledUtilization_ClassFraud_Flag
1101Alpha_pointEAST25450002solar15JAN201512JAN20258001JUL202511Alpha_point - EAST45000MEDIUMNO
2104Delta_beamEAST22410003solar11APR201718JAN20257001JUL20259Delta_beam - EAST41000LOWNO
3107Gulf_guideEAST28470001solar12JUL201902JAN20259001JUL20257Gulf_guide - EAST47000HIGHNO
4110Jetty_lightEAST26455002solar10OCT201512JAN20258001JUL202511Jetty_light - EAST45500MEDIUMNO
5113Marina_beamEAST27465001solar25JAN201902JAN20259001JUL20257Marina_beam - EAST46500HIGHNO
6102Beacon_rockWEST30520001electric10FEB201605JAN20259001JUL202510Beacon_rock - WEST52000HIGHNO
7105Eagle_eyeWEST35610000electric09MAY201801JAN202510001JUL20258Eagle_eye - WEST61000HIGHYES
8108Harbor_lightWEST32580000electric18AUG202005JAN202510001JUL20256Harbor_light - WEST58000HIGHYES
9111KeystoneWEST34600001electric15NOV201705JAN20259001JUL20259Keystone - WEST60000HIGHYES
10114Nautical_pointWEST33590000electric11FEB202105JAN202510001JUL20255Nautical_point - WEST59000HIGHYES
11103Coral_lightSOUTH18380004hybrid22MAR201420DEC20246001JUN202512Coral_light - SOUTH38000LOWNO
12106Falcon_towerSOUTH20395002hybrid15JUN201610DEC20248001JUN202510Falcon_tower - SOUTH39500MEDIUMNO
13109Island_beamSOUTH16360005hybrid21SEP201322DEC20245001JUN202513Island_beam - SOUTH36000LOWNO
14112Lagoon_lightSOUTH19390003hybrid01DEC201620DEC20247001JUN202510Lagoon_light - SOUTH39000LOWNO
15115Ocean_guardSOUTH21405004hybrid18MAR201418DEC20246001JUN202512Ocean_guard - SOUTH40500LOWNO

Final combined dataset is east

TRANSPOSE

proc transpose data=lighthouse_fraud out=lighthouse_t;

    var Maintenance_Cost Reliability_Index;

run;

proc print data=lighthouse_t;

run;

OUTPUT:

Obs_NAME_COL1COL2COL3COL4COL5COL6COL7COL8COL9COL10COL11COL12COL13COL14COL15COL16COL17
1Maintenance_Cost4500052000380004100061000395004700058000360004550060000390004650059000405004800062000
2Reliability_Index8090607010080901005080907090100609060

18. VISUALIZATION

proc sgplot data=lighthouse_fraud;

    vbar Coastline / response=Maintenance_Cost stat=mean;

run;

OUTPUT:

The SGPlot Procedure

19. DATASET CLEANUP

proc datasets lib=work;

    delete lighthouse_raw lighthouse_dates lighthouse_clean;

quit;

LOG:

NOTE: Deleting WORK.LIGHTHOUSE_RAW (memtype=DATA).
NOTE: Deleting WORK.LIGHTHOUSE_DATES (memtype=DATA).
NOTE: Deleting WORK.LIGHTHOUSE_CLEAN (memtype=DATA).

20. BUSINESS INSIGHTS

·  West coast lighthouses cost more but are more reliable

·  High maintenance cost with frequent failures indicates fraud risk

·  Solar-powered units show balanced cost-efficiency

·  Aging lighthouses require proactive servicing

21. 21 KEY POINTS ABOUT THIS PROJECT

1.     This project treats lighthouses as operational assets, similar to real-world infrastructure like data centers or telecom towers.

2.     A realistic dataset with more than 15 lighthouse records was created instead of using dummy or textbook data.

3.     Each lighthouse is uniquely identified, enabling tracking across multiple analytical steps.

4.     Operational variables such as visibility range, maintenance cost, power source, and failure incidents were carefully designed.

5.     Reliability Index was derived as a percentage to reflect operational health in a business-friendly metric.

6.     Intentional data errors were introduced to simulate real-world data quality issues.

7.     The project demonstrates how such errors are identified using PROC CONTENTS and logic checks.

8.     Error correction is explained step by step, mirroring actual industry practice.

9.     SAS Data Step programming is used extensively for data transformation and derivations.

10.  PROC SQL is applied for grouping, aggregation, and management-level summaries.

11.  Date functions such as MDY, INTCK, and INTNX are used to calculate service cycles and asset age.

12.  Character functions like STRIP, TRIM, UPCASE, LOWCASE, and CATX ensure clean and standardized data.

13.  Numeric functions such as COALESCE help manage missing and inconsistent values.

14.  Macros are used to automate utilization classification, improving code reusability and scalability.

15.  A rule-based fraud detection macro flags abnormal cost and failure combinations.

16.  PROC FREQ provides insight into coastline-wise and power-source-wise distributions.

17.  PROC MEANS and PROC UNIVARIATE support descriptive and distribution analysis.

18.  PROC CORR identifies relationships between maintenance cost, failures, and reliability.

19.  Data reshaping techniques using SET, MERGE, APPEND, and TRANSPOSE reflect real project workflows.

20.  Visualization using PROC SGPLOT converts analytics into decision-ready insights.

21.  The entire project is structured in an interview-ready, business-oriented format.

22. Conclusion

The Lighthouse Operations, Reliability & Fraud Analytics Using SAS project demonstrates a complete, real-world analytics lifecycle using SAS in a simple and understandable manner. Instead of focusing only on coding syntax, the project emphasizes business thinking, data quality awareness, and analytical reasoning. By modeling lighthouses as operational assets, the project mirrors how organizations monitor and manage critical infrastructure in industries such as transportation, energy, and public safety.

One of the strongest aspects of this project is the intentional inclusion of data errors. Real-world datasets are rarely clean, and the ability to identify, explain, and correct such issues is a key skill expected from a SAS programmer. The project clearly shows how incorrect data types or logic errors can affect analysis and how SAS tools can be used to resolve them effectively.

The use of macros adds automation and scalability, making the analysis efficient and reusable. Fraud detection logic highlights how analytics can go beyond reporting and support governance and compliance. Date functions, character handling, and numeric transformations further strengthen the project by addressing common data preparation challenges.

Overall, this project is not just a technical exercise but a practical demonstration of how SAS can be used to convert raw operational data into meaningful insights. It is especially valuable for interview preparation, as it showcases both programming proficiency and real-world problem-solving ability in a structured and professional way.



SAS INTERVIEW QUESTIONS

1.Why use INTCK vs INTNX?

2.How do macros help operational classification?

3.How do you detect data quality issues in SAS?


--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

About the Author:

SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.


Disclaimer:

The datasets and analysis in this article are created for educational and demonstration purposes only. They do not represent LIGHT HOUSE data.


Our Mission:

This blog provides industry-focused SAS programming tutorials and analytics projects covering finance, healthcare, and technology.


This project is suitable for:

·  Students learning SAS

·  Data analysts building portfolios

·  Professionals preparing for SAS interviews

·  Bloggers writing about analytics and smart cities

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Follow Us On : 


 
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--->Follow our blog for more SAS-based analytics projects and industry data models.

---> Support Us By Following Our Blog..

To deepen your understanding of SAS analytics, please refer to our other data science and industry-focused projects listed below:

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

About Us | Contact Privacy Policy


Comments

Popular posts from this blog

397.If a satellite has excellent signal strength but very high latency, can it still deliver good quality communication? Why or why not?A Sas Study

401.How Efficient Are Global Data Centers? A Complete SAS Analytics Study

383.Which Pharma Distributors Are High-Risk? Can SAS Detect Delays, Temperature Violations, and Fraud?