Tuesday, 16 September 2025

288.REALWORLD HIGHWAYS IN INDIA PROJECT DEMONSTRATED USING PROC PRINT | PROC MEANS | PROC FREQ | PROC SQL | PROC SORT | PROC REPORT | MACROS FOR CREATION, AGGREGATION, SUMMARIES, FILTERING, QC CHECKS AND AUTOMATED REPORTING WITH FULL ANALYSIS AND VALIDATION

REALWORLD HIGHWAYS IN INDIA PROJECT DEMONSTRATED USING PROC PRINT | PROC MEANS | PROC FREQ | PROC SQL | PROC SORT | PROC REPORT | MACROS FOR CREATION, AGGREGATION, SUMMARIES, FILTERING, QC CHECKS AND AUTOMATED REPORTING WITH FULL ANALYSIS AND VALIDATION

1) Create raw dataset 

options nocenter;

data work.india_highways;

  infile datalines dlm='|' dsd truncover;

  length HWY_ID $8 HWY_NAME $60 HWY_TYPE $6 START_CITY $30 END_CITY $30 STATES $120;

  format LENGTH_KM 8.2;

  input HWY_ID $ HWY_NAME $ HWY_TYPE $ START_CITY $ END_CITY $ LENGTH_KM STATES $;

datalines;

NH001|Delhi–Mumbai Expressway|NH|Delhi|Mumbai|1400.50|Delhi,Haryana,Rajasthan,Gujarat,Maharashtra

NH002|Bangalore–Chennai Highway|NH|Bangalore|Chennai|350.75|Karnataka,Tamil_Nadu

SH101|Mumbai–Pune Road|SH|Mumbai|Pune|150.40|Maharashtra

NH003|Kolkata–Durgapur Corridor|NH|Kolkata|Durgapur|180.20|West_Bengal

NH004|Amritsar–Jalandhar Link|NH|Amritsar|Jalandhar|80.00|Punjab

NH005|Agra–Lucknow Expressway|NH|Agra|Lucknow|302.00|Uttar_Pradesh

NH006|Hyderabad–Vijayawada Road|NH|Hyderabad|Vijayawada|275.30|Telangana,Andhra_Pradesh

SH102|Goa Coastal Road|SH|Panaji|Margao|60.10|Goa

NH007|Surat–Vadodara Highway|NH|Surat|Vadodara|150.00|Gujarat

NH008|Patna–Gaya Connector|NH|Patna|Gaya|92.80|Bihar

NH009|Chandigarh–Shimla Hill Road|NH|Chandigarh|Shimla|112.50|Punjab,Himachal_Pradesh

NH010|Bhopal–Indore Expressway|NH|Bhopal|Indore|200.25|Madhya_Pradesh

SH103|Siliguri Bypass|SH|Siliguri|Bagdogra|25.00|West_Bengal

NH011|Kochi–Calicut Coastal Link|NH|Kochi|Calicut|190.60|Kerala

NH012|Visakhapatnam–Vijayanagaram|NH|Visakhapatnam|Vijayanagaram|110.75|Andhra_Pradesh

;

run;

proc print;run;

Output:

ObsHWY_IDHWY_NAMEHWY_TYPESTART_CITYEND_CITYSTATESLENGTH_KM
1NH001Delhi–Mumbai ExpresswayNHDelhiMumbaiDelhi,Haryana,Rajasthan,Gujarat,Maharashtra1400.50
2NH002Bangalore–Chennai HighwayNHBangaloreChennaiKarnataka,Tamil_Nadu350.75
3SH101Mumbai–Pune RoadSHMumbaiPuneMaharashtra150.40
4NH003Kolkata–Durgapur CorridorNHKolkataDurgapurWest_Bengal180.20
5NH004Amritsar–Jalandhar LinkNHAmritsarJalandharPunjab80.00
6NH005Agra–Lucknow ExpresswayNHAgraLucknowUttar_Pradesh302.00
7NH006Hyderabad–Vijayawada RoadNHHyderabadVijayawadaTelangana,Andhra_Pradesh275.30
8SH102Goa Coastal RoadSHPanajiMargaoGoa60.10
9NH007Surat–Vadodara HighwayNHSuratVadodaraGujarat150.00
10NH008Patna–Gaya ConnectorNHPatnaGayaBihar92.80
11NH009Chandigarh–Shimla Hill RoadNHChandigarhShimlaPunjab,Himachal_Pradesh112.50
12NH010Bhopal–Indore ExpresswayNHBhopalIndoreMadhya_Pradesh200.25
13SH103Siliguri BypassSHSiliguriBagdograWest_Bengal25.00
14NH011Kochi–Calicut Coastal LinkNHKochiCalicutKerala190.60
15NH012Visakhapatnam–VijayanagaramNHVisakhapatnamVijayanagaramAndhra_Pradesh110.75


2) Quick print to inspect (PROC PRINT) 

Purpose: Quick tabular inspection of the raw dataset to ensure correct import. 

proc print data=work.india_highways(obs=10) noobs label;

  title "India Highways - Raw Data";

run;

Output:

India Highways - Raw Data

HWY_IDHWY_NAMEHWY_TYPESTART_CITYEND_CITYSTATESLENGTH_KM
NH001Delhi–Mumbai ExpresswayNHDelhiMumbaiDelhi,Haryana,Rajasthan,Gujarat,Maharashtra1400.50
NH002Bangalore–Chennai HighwayNHBangaloreChennaiKarnataka,Tamil_Nadu350.75
SH101Mumbai–Pune RoadSHMumbaiPuneMaharashtra150.40
NH003Kolkata–Durgapur CorridorNHKolkataDurgapurWest_Bengal180.20
NH004Amritsar–Jalandhar LinkNHAmritsarJalandharPunjab80.00
NH005Agra–Lucknow ExpresswayNHAgraLucknowUttar_Pradesh302.00
NH006Hyderabad–Vijayawada RoadNHHyderabadVijayawadaTelangana,Andhra_Pradesh275.30
SH102Goa Coastal RoadSHPanajiMargaoGoa60.10
NH007Surat–Vadodara HighwayNHSuratVadodaraGujarat150.00
NH008Patna–Gaya ConnectorNHPatnaGayaBihar92.80

3) Create a numeric variable for number of states traversed using DATA step + INSTR count 

Purpose: Derive a numeric variable STATES_COUNT for aggregation and sorting. 

data work.india_highways2;

  set work.india_highways;

/*   count states by counting commas + 1 (if not missing) */

  if missing(STATES) then STATES_COUNT = 0;

  else STATES_COUNT = countw(STATES, ',');

run;

proc print;run;

Output:

ObsHWY_IDHWY_NAMEHWY_TYPESTART_CITYEND_CITYSTATESLENGTH_KMSTATES_COUNT
1NH001Delhi–Mumbai ExpresswayNHDelhiMumbaiDelhi,Haryana,Rajasthan,Gujarat,Maharashtra1400.505
2NH002Bangalore–Chennai HighwayNHBangaloreChennaiKarnataka,Tamil_Nadu350.752
3SH101Mumbai–Pune RoadSHMumbaiPuneMaharashtra150.401
4NH003Kolkata–Durgapur CorridorNHKolkataDurgapurWest_Bengal180.201
5NH004Amritsar–Jalandhar LinkNHAmritsarJalandharPunjab80.001
6NH005Agra–Lucknow ExpresswayNHAgraLucknowUttar_Pradesh302.001
7NH006Hyderabad–Vijayawada RoadNHHyderabadVijayawadaTelangana,Andhra_Pradesh275.302
8SH102Goa Coastal RoadSHPanajiMargaoGoa60.101
9NH007Surat–Vadodara HighwayNHSuratVadodaraGujarat150.001
10NH008Patna–Gaya ConnectorNHPatnaGayaBihar92.801
11NH009Chandigarh–Shimla Hill RoadNHChandigarhShimlaPunjab,Himachal_Pradesh112.502
12NH010Bhopal–Indore ExpresswayNHBhopalIndoreMadhya_Pradesh200.251
13SH103Siliguri BypassSHSiliguriBagdograWest_Bengal25.001
14NH011Kochi–Calicut Coastal LinkNHKochiCalicutKerala190.601
15NH012Visakhapatnam–VijayanagaramNHVisakhapatnamVijayanagaramAndhra_Pradesh110.751


4) Format: define LENGTH categories using PROC FORMAT 

Purpose: Create user-friendly categories for highway lengths to use in reports. 

proc format;

  value lengthcat

    low - <100 = 'Short (<100 km)'

    100 - <300 = 'Medium (100-299 km)'

    300 - high = 'Long (>=300 km)';

run;

Log:

NOTE: Format LENGTHCAT has been output.

5) PROC MEANS for numeric summary 

Purpose: Provide numerical summaries (count, mean, median, min, max, std) of numeric 

vars. 

proc means data=work.india_highways2 n mean median min max std maxdec=2;

  var LENGTH_KM STATES_COUNT;

  title "Numeric Summary: Lengths and States Count";

run;

Output:

Numeric Summary: Lengths and States Count

The MEANS Procedure

VariableNMeanMedianMinimumMaximumStd Dev
LENGTH_KM
STATES_COUNT
15
15
245.41
1.47
150.40
1.00
25.00
1.00
1400.50
5.00
332.33
1.06

6) PROC FREQ for categorical summaries 

Purpose: Frequency distributions for types and cross-tab with length categories. 

proc freq data=work.india_highways2;

  tables HWY_TYPE / nocum norow nopercent;

  format LENGTH_KM lengthcat.;

  tables LENGTH_KM*HWY_TYPE / list;

  title "Categorical Summary: Highway Type and Length categories";

run;

Output:

Categorical Summary: Highway Type and Length categories

The FREQ Procedure

HWY_TYPEFrequency
NH12
SH3
LENGTH_KMHWY_TYPEFrequencyPercentCumulative
Frequency
Cumulative
Percent
Short (<100 km)NH213.33213.33
Short (<100 km)SH213.33426.67
Medium (100-299 km)NH746.671173.33
Medium (100-299 km)SH16.671280.00
Long (>=300 km)NH320.0015100.00

7) PROC SQL: aggregate by HWY_TYPE 

Purpose: Aggregate totals and averages by highway type using PROC SQL. 

proc sql;

  create table work.type_summary as

  select HWY_TYPE,

         count(*) as Num_Highways,

         sum(LENGTH_KM) as Total_Length format=8.2,

         mean(LENGTH_KM) as Avg_Length format=8.2

  from work.india_highways2

  group by HWY_TYPE

  order by Total_Length desc;

quit;

proc print;run;

Output:

ObsHWY_TYPENum_HighwaysTotal_LengthAvg_Length
1NH123445.65287.14
2SH3235.5078.50


8) PROC SORT then PROC REPORT 

Purpose: Sort highways by length (descending) for prioritized reports. 

proc sort data=work.india_highways2 out=work.high_sorted;

  by descending LENGTH_KM;

run;

proc print;run;

Output:

ObsHWY_IDHWY_NAMEHWY_TYPESTART_CITYEND_CITYSTATESLENGTH_KMSTATES_COUNT
1NH001Delhi–Mumbai ExpresswayNHDelhiMumbaiDelhi,Haryana,Rajasthan,Gujarat,Maharashtra1400.505
2NH002Bangalore–Chennai HighwayNHBangaloreChennaiKarnataka,Tamil_Nadu350.752
3NH005Agra–Lucknow ExpresswayNHAgraLucknowUttar_Pradesh302.001
4NH006Hyderabad–Vijayawada RoadNHHyderabadVijayawadaTelangana,Andhra_Pradesh275.302
5NH010Bhopal–Indore ExpresswayNHBhopalIndoreMadhya_Pradesh200.251
6NH011Kochi–Calicut Coastal LinkNHKochiCalicutKerala190.601
7NH003Kolkata–Durgapur CorridorNHKolkataDurgapurWest_Bengal180.201
8SH101Mumbai–Pune RoadSHMumbaiPuneMaharashtra150.401
9NH007Surat–Vadodara HighwayNHSuratVadodaraGujarat150.001
10NH009Chandigarh–Shimla Hill RoadNHChandigarhShimlaPunjab,Himachal_Pradesh112.502
11NH012Visakhapatnam–VijayanagaramNHVisakhapatnamVijayanagaramAndhra_Pradesh110.751
12NH008Patna–Gaya ConnectorNHPatnaGayaBihar92.801
13NH004Amritsar–Jalandhar LinkNHAmritsarJalandharPunjab80.001
14SH102Goa Coastal RoadSHPanajiMargaoGoa60.101
15SH103Siliguri BypassSHSiliguriBagdograWest_Bengal25.001


Purpose: Produce a formatted report combining multiple variables for a compact summary. 

proc report data=work.high_sorted nowd;

  columns HWY_ID HWY_NAME HWY_TYPE START_CITY END_CITY LENGTH_KM STATES_COUNT STATES;

  define LENGTH_KM / analysis sum format=8.2 'Length (km)';

  define STATES_COUNT / analysis sum 'States Count';

  title "Detailed Highway Report (sorted by length)";

run;

Output:

Detailed Highway Report (sorted by length)

HWY_IDHWY_NAMEHWY_TYPESTART_CITYEND_CITYLength (km)States CountSTATES
NH001Delhi–Mumbai ExpresswayNHDelhiMumbai1400.505Delhi,Haryana,Rajasthan,Gujarat,Maharashtra
NH002Bangalore–Chennai HighwayNHBangaloreChennai350.752Karnataka,Tamil_Nadu
NH005Agra–Lucknow ExpresswayNHAgraLucknow302.001Uttar_Pradesh
NH006Hyderabad–Vijayawada RoadNHHyderabadVijayawada275.302Telangana,Andhra_Pradesh
NH010Bhopal–Indore ExpresswayNHBhopalIndore200.251Madhya_Pradesh
NH011Kochi–Calicut Coastal LinkNHKochiCalicut190.601Kerala
NH003Kolkata–Durgapur CorridorNHKolkataDurgapur180.201West_Bengal
SH101Mumbai–Pune RoadSHMumbaiPune150.401Maharashtra
NH007Surat–Vadodara HighwayNHSuratVadodara150.001Gujarat
NH009Chandigarh–Shimla Hill RoadNHChandigarhShimla112.502Punjab,Himachal_Pradesh
NH012Visakhapatnam–VijayanagaramNHVisakhapatnamVijayanagaram110.751Andhra_Pradesh
NH008Patna–Gaya ConnectorNHPatnaGaya92.801Bihar
NH004Amritsar–Jalandhar LinkNHAmritsarJalandhar80.001Punjab
SH102Goa Coastal RoadSHPanajiMargao60.101Goa
SH103Siliguri BypassSHSiliguriBagdogra25.001West_Bengal

9) Macro: filter dataset by type and produce summary (macro parameter: type=) 

Purpose: Reusable macro to create a filtered table and numeric summary for a given 

highway type. 

%macro summary_by_type(type=NH);

  %put NOTE: Running summary for type=&type;

  proc sql;

    create table work.&type._highways as

    select * from work.india_highways2

    where upcase(HWY_TYPE)=upcase("&type")

    order by LENGTH_KM desc;

  quit;


  proc means data=work.&type._highways n mean median min max maxdec=2;

    var LENGTH_KM STATES_COUNT;

    title "Macro Summary: Type=&type - Numeric Summary";

  run;


  proc print data=work.&type._highways noobs;

    title "Macro Output: All highways of type &type";

  run;

%mend summary_by_type;


%summary_by_type(type=NH);

Output:

Macro Summary: Type=NH - Numeric Summary

The MEANS Procedure

VariableNMeanMedianMinimumMaximum
LENGTH_KM
STATES_COUNT
12
12
287.14
1.58
185.40
1.00
80.00
1.00
1400.50
5.00

Macro Output: All highways of type NH

HWY_IDHWY_NAMEHWY_TYPESTART_CITYEND_CITYSTATESLENGTH_KMSTATES_COUNT
NH001Delhi–Mumbai ExpresswayNHDelhiMumbaiDelhi,Haryana,Rajasthan,Gujarat,Maharashtra1400.505
NH002Bangalore–Chennai HighwayNHBangaloreChennaiKarnataka,Tamil_Nadu350.752
NH005Agra–Lucknow ExpresswayNHAgraLucknowUttar_Pradesh302.001
NH006Hyderabad–Vijayawada RoadNHHyderabadVijayawadaTelangana,Andhra_Pradesh275.302
NH010Bhopal–Indore ExpresswayNHBhopalIndoreMadhya_Pradesh200.251
NH011Kochi–Calicut Coastal LinkNHKochiCalicutKerala190.601
NH003Kolkata–Durgapur CorridorNHKolkataDurgapurWest_Bengal180.201
NH007Surat–Vadodara HighwayNHSuratVadodaraGujarat150.001
NH009Chandigarh–Shimla Hill RoadNHChandigarhShimlaPunjab,Himachal_Pradesh112.502
NH012Visakhapatnam–VijayanagaramNHVisakhapatnamVijayanagaramAndhra_Pradesh110.751
NH008Patna–Gaya ConnectorNHPatnaGayaBihar92.801
NH004Amritsar–Jalandhar LinkNHAmritsarJalandharPunjab80.001

%summary_by_type(type=SH);

Output:

Macro Summary: Type=SH - Numeric Summary

The MEANS Procedure

VariableNMeanMedianMinimumMaximum
LENGTH_KM
STATES_COUNT
3
3
78.50
1.00
60.10
1.00
25.00
1.00
150.40
1.00

Macro Output: All highways of type SH

HWY_IDHWY_NAMEHWY_TYPESTART_CITYEND_CITYSTATESLENGTH_KMSTATES_COUNT
SH101Mumbai–Pune RoadSHMumbaiPuneMaharashtra150.401
SH102Goa Coastal RoadSHPanajiMargaoGoa60.101
SH103Siliguri BypassSHSiliguriBagdograWest_Bengal25.001

10) Macro: create LENGTH_CATEGORY var and output a crosswalk table 

Purpose: Macro to derive a categorical length variable and summarize counts/averages. 

%macro create_length_cat(in=work.india_highways2, out=work.high_cat);

  data &out;

    set &in;

    if LENGTH_KM < 100 then LENGTH_CAT='Short (<100)';

    else if LENGTH_KM < 300 then LENGTH_CAT='Medium (100-299)';

    else LENGTH_CAT='Long (>=300)';

  run;


proc sql;

    create table &out._summary as

    select LENGTH_CAT, count(*) as Count_Hwys, mean(LENGTH_KM) as Avg_Length format=8.2

    from &out

    group by LENGTH_CAT;

  quit;

proc print;run;

%mend create_length_cat;


%create_length_cat(in=work.india_highways2, out=work.high_cat);

Output:

ObsLENGTH_CATCount_HwysAvg_Length
1Long (>=300)3684.42
2Medium (100-8171.25
3Short (<100)464.48


11) PROC SQL to find top 5 longest highways 

Purpose: Extract top N (here 5) longest highways for focused review. 

proc sql outobs=5;

  create table work.top5_longest as

  select HWY_ID, HWY_NAME, LENGTH_KM, STATES

  from work.india_highways2

  order by LENGTH_KM desc;

quit;

proc print;run;

Output:

ObsHWY_IDHWY_NAMELENGTH_KMSTATES
1NH001Delhi–Mumbai Expressway1400.50Delhi,Haryana,Rajasthan,Gujarat,Maharashtra
2NH002Bangalore–Chennai Highway350.75Karnataka,Tamil_Nadu
3NH005Agra–Lucknow Expressway302.00Uttar_Pradesh
4NH006Hyderabad–Vijayawada Road275.30Telangana,Andhra_Pradesh
5NH010Bhopal–Indore Expressway200.25Madhya_Pradesh


12) Basic QC: check duplicates on HWY_ID 

Purpose: Identify duplicate HWY_IDs (if any) for data integrity checks. 

proc sort data=work.india_highways2 out=work.qc_sorted nodupkey dupout=work.dup_hwys;

  by HWY_ID;

run;

proc print data=work.dup_hwys noobs;

  title "QC: Duplicate HWY_IDs (if present)";

run;

Log:

NOTE: No observations in data set WORK.DUP_HWYS.




To Visit Our Previous Online Streaming Flatform Dataset:Click Here
To Visit Our Previous Statistical Evaluation Of Clinical Trials:Click Here
To Visit Our Previous Unlocking Retail Insights Dataset:Click Here
To Visit Our Previous Sas Interview Questions-1:Click Here



Follow Us On : 


 


--- FOLLOW OUR BLOG FOR MORE INFORMATION.

--->PLEASE DO COMMENTS AND SHARE OUR BLOG.



No comments:

Post a Comment