437.Can Advanced SAS Programming Detect, Correct, and Optimize Global Cargo Shipping Route Data While Improving Efficiency and Reducing Costs?
Data-Driven Shipping Excellence: Detecting Errors and Boosting Route Performance with SAS
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
IN THIS PROJECT WE USE THESE SAS STATEMENTS AND TOOLS: DATA | SET | MERGE | INPUT | DATALINES | IF | BY | OUTPUT | PROC SORT | PROC APPEND | PROC TRANSPOSE | PROC DATASETS | RUN | %MACRO | %MEND | SAS FUNCTIONS | R LANGUAGE
Table of Contents
1. Introduction
2. Business Context
3. Raw Dataset (SAS)
4. Raw Dataset in R
5. Intentional Error Dataset
6. Error Correction Code
7. Derived Variables
8. Sorting
9. Merge Example
10. PROC SQL
11. PROC TRANSPOSE
12. PROC MEANS
13. PROC FREQ
14. PROC SGPLOT
15. Macro Implementation
16. PROC APPEND
17. PROC DATASETS
18. Final Clean Dataset Code
19. Key Insights
20. 20 Key Points About The Project
21. Summary
22. Conclusion
1. Introduction
Global cargo shipping plays a critical role in enabling international trade,
connecting continents, and supporting the global economy. Every day, thousands
of vessels transport goods such as oil, food products, machinery, and raw
materials across vast ocean routes. Managing these operations efficiently
requires accurate and well-structured data related to shipping routes, fuel
consumption, cargo capacity, weather conditions, and operational costs.
However, in real-world scenarios, shipping data is often prone to
inconsistencies such as missing values, incorrect entries, unrealistic
efficiency percentages, and data type mismatches. These issues can lead to poor
decision-making, increased fuel costs, delays in delivery, and reduced
operational efficiency. Therefore, ensuring high-quality data is essential for
logistics optimization and business success.
This project focuses on creating a simulated global cargo shipping dataset
and applying Advanced SAS Programming techniques to handle such challenges.
Using SAS statements, we demonstrate how to identify data errors, clean
and transform datasets, and derive meaningful insights such as route efficiency
and cost optimization. Additionally, intentional errors are introduced to
replicate real-world data issues, allowing us to showcase practical debugging
and correction strategies.
By the end of this project, we will have a clean, optimized dataset and a
deeper understanding of how SAS can be used as a powerful tool for data
validation, transformation, and analytics in the shipping and logistics domain.
2. Business Context
Shipping companies rely on accurate route data to:
- Optimize fuel consumption
- Minimize weather delays
- Improve route efficiency
- Reduce operational costs
Poor data quality leads to:
- Wrong decisions
- Financial losses
- Inefficient logistics
3. Raw Dataset (SAS)
data cargo_raw;
input Route_ID $ Distance_nm Fuel_Consumption Cargo_Tons Weather_Delay
Route_Efficiency Fees;
datalines;
R001 1200 500 200 2 85 10000
R002 1500 700 300 3 78 15000
R003 1800 800 350 5 90 20000
R004 2000 900 400 7 95 25000
R005 2200 950 450 8 92 27000
R006 2500 1000 500 10 88 30000
R007 2700 1100 550 12 80 32000
R008 3000 1200 600 15 75 35000
R009 3200 1300 650 18 70 38000
R010 3500 1400 700 20 65 40000
R011 3700 1500 750 22 60 42000
R012 4000 1600 800 25 55 45000
;
run;
proc print data=cargo_raw;
run;
OUTPUT:
| Obs | Route_ID | Distance_nm | Fuel_Consumption | Cargo_Tons | Weather_Delay | Route_Efficiency | Fees |
|---|---|---|---|---|---|---|---|
| 1 | R001 | 1200 | 500 | 200 | 2 | 85 | 10000 |
| 2 | R002 | 1500 | 700 | 300 | 3 | 78 | 15000 |
| 3 | R003 | 1800 | 800 | 350 | 5 | 90 | 20000 |
| 4 | R004 | 2000 | 900 | 400 | 7 | 95 | 25000 |
| 5 | R005 | 2200 | 950 | 450 | 8 | 92 | 27000 |
| 6 | R006 | 2500 | 1000 | 500 | 10 | 88 | 30000 |
| 7 | R007 | 2700 | 1100 | 550 | 12 | 80 | 32000 |
| 8 | R008 | 3000 | 1200 | 600 | 15 | 75 | 35000 |
| 9 | R009 | 3200 | 1300 | 650 | 18 | 70 | 38000 |
| 10 | R010 | 3500 | 1400 | 700 | 20 | 65 | 40000 |
| 11 | R011 | 3700 | 1500 | 750 | 22 | 60 | 42000 |
| 12 | R012 | 4000 | 1600 | 800 | 25 | 55 | 45000 |
Explanation (SAS)
· DATA cargo_raw; → Creates the dataset
· INPUT → Defines the structure (same columns as the R version)
· DATALINES → Inline raw data entry
· $ → Character variable (Route_ID)
· Remaining variables → Numeric
Why used?
· Fundamental dataset creation in SAS
4. Raw Dataset in R
cargo_raw <- data.frame(
Route_ID = c("R001","R002","R003","R004","R005","R006",
"R007","R008","R009","R010","R011","R012"),
Distance_nm = c(1200,1500,1800,2000,2200,2500,
2700,3000,3200,3500,3700,4000),
Fuel_Consumption = c(500,700,800,900,950,1000,
1100,1200,1300,1400,1500,1600),
Cargo_Tons = c(200,300,350,400,450,500,
550,600,650,700,750,800),
Weather_Delay = c(2,3,5,7,8,10,
12,15,18,20,22,25),
Route_Efficiency = c(85,78,90,95,92,88,
80,75,70,65,60,55),
Fees = c(10000,15000,20000,25000,27000,30000,
32000,35000,38000,40000,42000,45000)
)
OUTPUT:
| | Route_ID | Distance_nm | Fuel_Consumption | Cargo_Tons | Weather_Delay | Route_Efficiency | Fees |
|---|---|---|---|---|---|---|---|
| 1 | R001 | 1200 | 500 | 200 | 2 | 85 | 10000 |
| 2 | R002 | 1500 | 700 | 300 | 3 | 78 | 15000 |
| 3 | R003 | 1800 | 800 | 350 | 5 | 90 | 20000 |
| 4 | R004 | 2000 | 900 | 400 | 7 | 95 | 25000 |
| 5 | R005 | 2200 | 950 | 450 | 8 | 92 | 27000 |
| 6 | R006 | 2500 | 1000 | 500 | 10 | 88 | 30000 |
| 7 | R007 | 2700 | 1100 | 550 | 12 | 80 | 32000 |
| 8 | R008 | 3000 | 1200 | 600 | 15 | 75 | 35000 |
| 9 | R009 | 3200 | 1300 | 650 | 18 | 70 | 38000 |
| 10 | R010 | 3500 | 1400 | 700 | 20 | 65 | 40000 |
| 11 | R011 | 3700 | 1500 | 750 | 22 | 60 | 42000 |
| 12 | R012 | 4000 | 1600 | 800 | 25 | 55 | 45000 |
Explanation (R)
· data.frame() → Creates a tabular dataset
· Each column matches the SAS dataset exactly
· Order of observations is identical
· Data types: character for Route_ID, numeric for all other variables
5. Intentional Error Dataset
data cargo_error;
set cargo_raw;
if Route_ID='R005' then Fuel_Consumption=.;
if Route_ID='R007' then Distance_nm='ABC';
if Route_ID='R009' then Route_Efficiency=150;
run;
proc print data=cargo_error;
run;
OUTPUT:
| Obs | Route_ID | Distance_nm | Fuel_Consumption | Cargo_Tons | Weather_Delay | Route_Efficiency | Fees |
|---|---|---|---|---|---|---|---|
| 1 | R001 | 1200 | 500 | 200 | 2 | 85 | 10000 |
| 2 | R002 | 1500 | 700 | 300 | 3 | 78 | 15000 |
| 3 | R003 | 1800 | 800 | 350 | 5 | 90 | 20000 |
| 4 | R004 | 2000 | 900 | 400 | 7 | 95 | 25000 |
| 5 | R005 | 2200 | . | 450 | 8 | 92 | 27000 |
| 6 | R006 | 2500 | 1000 | 500 | 10 | 88 | 30000 |
| 7 | R007 | . | 1100 | 550 | 12 | 80 | 32000 |
| 8 | R008 | 3000 | 1200 | 600 | 15 | 75 | 35000 |
| 9 | R009 | 3200 | 1300 | 650 | 18 | 150 | 38000 |
| 10 | R010 | 3500 | 1400 | 700 | 20 | 65 | 40000 |
| 11 | R011 | 3700 | 1500 | 750 | 22 | 60 | 42000 |
| 12 | R012 | 4000 | 1600 | 800 | 25 | 55 | 45000 |
Errors Explanation
1. Missing Fuel Consumption: "." marks a missing numeric value (R005).
2. Character in Numeric Field: 'ABC' is assigned to the numeric variable Distance_nm; SAS writes an error note and stores a missing value (R007).
3. Invalid Efficiency: Route_Efficiency of 150 exceeds the realistic 0-100% range (R009).
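For readers following along outside SAS, the same three intentional errors can be sketched in plain Python. This is only an illustration of the error-injection idea, not part of the SAS workflow; the list name `cargo` and the reduced set of columns are chosen here for brevity.

```python
# A plain-Python copy of three affected rows (clean starting values).
cargo = [
    {"Route_ID": "R005", "Fuel_Consumption": 950, "Distance_nm": 2200, "Route_Efficiency": 92},
    {"Route_ID": "R007", "Fuel_Consumption": 1100, "Distance_nm": 2700, "Route_Efficiency": 80},
    {"Route_ID": "R009", "Fuel_Consumption": 1300, "Distance_nm": 3200, "Route_Efficiency": 70},
]

# Inject the same three intentional errors as the SAS DATA step.
for row in cargo:
    if row["Route_ID"] == "R005":
        row["Fuel_Consumption"] = None   # missing value (SAS '.')
    if row["Route_ID"] == "R007":
        row["Distance_nm"] = "ABC"       # character stored in a numeric field
    if row["Route_ID"] == "R009":
        row["Route_Efficiency"] = 150    # impossible efficiency (> 100%)
```

In Python the type error is silently allowed, whereas SAS flags it in the log; that difference is exactly why explicit validation rules are needed on either platform.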
6. Error Correction Code
data cargo_clean;
set cargo_error;
if Fuel_Consumption=. then Fuel_Consumption=avg(500,700,800);
if notdigit(Distance_nm)>0 then Distance_nm=2000;
if Route_Efficiency>100 then Route_Efficiency=100;
run;
proc print data=cargo_clean;
run;
Corrected Code:
data cargo_clean;
set cargo_error;
/* Fix missing fuel consumption */
if Fuel_Consumption = . then Fuel_Consumption = mean(500,700,800);
/* Fix invalid numeric values */
if notdigit(strip(put(Distance_nm, best.))) > 0 then Distance_nm = 2000;
/* Fix efficiency > 100 */
if Route_Efficiency > 100 then Route_Efficiency = 100;
run;
proc print data=cargo_clean;
run;
OUTPUT:
| Obs | Route_ID | Distance_nm | Fuel_Consumption | Cargo_Tons | Weather_Delay | Route_Efficiency | Fees |
|---|---|---|---|---|---|---|---|
| 1 | R001 | 1200 | 500.00 | 200 | 2 | 85 | 10000 |
| 2 | R002 | 1500 | 700.00 | 300 | 3 | 78 | 15000 |
| 3 | R003 | 1800 | 800.00 | 350 | 5 | 90 | 20000 |
| 4 | R004 | 2000 | 900.00 | 400 | 7 | 95 | 25000 |
| 5 | R005 | 2200 | 666.67 | 450 | 8 | 92 | 27000 |
| 6 | R006 | 2500 | 1000.00 | 500 | 10 | 88 | 30000 |
| 7 | R007 | 2000 | 1100.00 | 550 | 12 | 80 | 32000 |
| 8 | R008 | 3000 | 1200.00 | 600 | 15 | 75 | 35000 |
| 9 | R009 | 3200 | 1300.00 | 650 | 18 | 100 | 38000 |
| 10 | R010 | 3500 | 1400.00 | 700 | 20 | 65 | 40000 |
| 11 | R011 | 3700 | 1500.00 | 750 | 22 | 60 | 42000 |
| 12 | R012 | 4000 | 1600.00 | 800 | 25 | 55 | 45000 |
Explanation
1. Fix for the AVG Error
Wrong: avg(500,700,800)
Correct: mean(500,700,800)
Why? MEAN() is a valid DATA step function: it calculates the average of its arguments and automatically ignores missing values. AVG() is available only as a summary function in PROC SQL.
2. NOTDIGIT Issue Fix
Original: notdigit(Distance_nm)
Problem: NOTDIGIT() works only on character values, and Distance_nm is numeric, so the call relies on an implicit numeric-to-character conversion that produces log notes and unreliable results.
Fixed: notdigit(strip(put(Distance_nm, best.)))
Why? PUT() converts the numeric value to character, STRIP() removes surrounding spaces, and NOTDIGIT() then checks for invalid characters.
3. Efficiency Correction
if Route_Efficiency > 100 then Route_Efficiency = 100;
Why? Efficiency cannot exceed 100%, so capping the value enforces a realistic business rule.
Key Learnings
- AVG() is not valid in the DATA step; MEAN() is the correct function.
- NOTDIGIT() is for character variables only.
- Always validate data types before applying functions.
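The same three correction rules can be cross-checked in a short Python sketch (illustrative only; the function name `clean_row` is invented here, and the digits-only check mirrors NOTDIGIT for whole-number distances):

```python
from statistics import mean

def clean_row(row):
    """Apply the three correction rules from the SAS DATA step."""
    if row["Fuel_Consumption"] is None:          # mean imputation, like MEAN(500,700,800)
        row["Fuel_Consumption"] = mean([500, 700, 800])
    if not str(row["Distance_nm"]).isdigit():    # NOTDIGIT-style validity check
        row["Distance_nm"] = 2000
    if row["Route_Efficiency"] > 100:            # cap efficiency at 100%
        row["Route_Efficiency"] = 100
    return row

# The R005/R007/R009 errors, all in one hypothetical row for compactness:
fixed = clean_row({"Fuel_Consumption": None, "Distance_nm": "ABC", "Route_Efficiency": 150})
```

After cleaning, the imputed fuel value is 666.67, the invalid distance becomes 2000, and the efficiency is capped at 100, matching the corrected SAS output above.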
7. Derived Variables
data cargo_derived;
set cargo_clean;
Fuel_Efficiency = Distance_nm / Fuel_Consumption;
Delay_Impact = Weather_Delay * 2;
run;
proc print data=cargo_derived;
run;
OUTPUT:
| Obs | Route_ID | Distance_nm | Fuel_Consumption | Cargo_Tons | Weather_Delay | Route_Efficiency | Fees | Fuel_Efficiency | Delay_Impact |
|---|---|---|---|---|---|---|---|---|---|
| 1 | R001 | 1200 | 500.00 | 200 | 2 | 85 | 10000 | 2.40000 | 4 |
| 2 | R002 | 1500 | 700.00 | 300 | 3 | 78 | 15000 | 2.14286 | 6 |
| 3 | R003 | 1800 | 800.00 | 350 | 5 | 90 | 20000 | 2.25000 | 10 |
| 4 | R004 | 2000 | 900.00 | 400 | 7 | 95 | 25000 | 2.22222 | 14 |
| 5 | R005 | 2200 | 666.67 | 450 | 8 | 92 | 27000 | 3.30000 | 16 |
| 6 | R006 | 2500 | 1000.00 | 500 | 10 | 88 | 30000 | 2.50000 | 20 |
| 7 | R007 | 2000 | 1100.00 | 550 | 12 | 80 | 32000 | 1.81818 | 24 |
| 8 | R008 | 3000 | 1200.00 | 600 | 15 | 75 | 35000 | 2.50000 | 30 |
| 9 | R009 | 3200 | 1300.00 | 650 | 18 | 100 | 38000 | 2.46154 | 36 |
| 10 | R010 | 3500 | 1400.00 | 700 | 20 | 65 | 40000 | 2.50000 | 40 |
| 11 | R011 | 3700 | 1500.00 | 750 | 22 | 60 | 42000 | 2.46667 | 44 |
| 12 | R012 | 4000 | 1600.00 | 800 | 25 | 55 | 45000 | 2.50000 | 50 |
Why?
· Fuel_Efficiency (nautical miles per unit of fuel) and Delay_Impact (a doubled-delay penalty score) make routes directly comparable for business analysis.
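The two derived variables are simple arithmetic, so a plain-Python sketch of the same formulas (illustrative, with two sample rows) reads:

```python
rows = [
    {"Route_ID": "R001", "Distance_nm": 1200, "Fuel_Consumption": 500, "Weather_Delay": 2},
    {"Route_ID": "R005", "Distance_nm": 2200, "Fuel_Consumption": 666.67, "Weather_Delay": 8},
]
for r in rows:
    # Nautical miles covered per unit of fuel burned.
    r["Fuel_Efficiency"] = r["Distance_nm"] / r["Fuel_Consumption"]
    # Simple penalty score: twice the delay hours.
    r["Delay_Impact"] = r["Weather_Delay"] * 2
```

R001 yields 1200/500 = 2.4 and R005 yields 2200/666.67 ≈ 3.30, the same figures as the SAS output table.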
8. Sorting
proc sort data=cargo_derived;
by descending Route_Efficiency;
run;
proc print data=cargo_derived;
run;
OUTPUT:
| Obs | Route_ID | Distance_nm | Fuel_Consumption | Cargo_Tons | Weather_Delay | Route_Efficiency | Fees | Fuel_Efficiency | Delay_Impact |
|---|---|---|---|---|---|---|---|---|---|
| 1 | R009 | 3200 | 1300.00 | 650 | 18 | 100 | 38000 | 2.46154 | 36 |
| 2 | R004 | 2000 | 900.00 | 400 | 7 | 95 | 25000 | 2.22222 | 14 |
| 3 | R005 | 2200 | 666.67 | 450 | 8 | 92 | 27000 | 3.30000 | 16 |
| 4 | R003 | 1800 | 800.00 | 350 | 5 | 90 | 20000 | 2.25000 | 10 |
| 5 | R006 | 2500 | 1000.00 | 500 | 10 | 88 | 30000 | 2.50000 | 20 |
| 6 | R001 | 1200 | 500.00 | 200 | 2 | 85 | 10000 | 2.40000 | 4 |
| 7 | R007 | 2000 | 1100.00 | 550 | 12 | 80 | 32000 | 1.81818 | 24 |
| 8 | R002 | 1500 | 700.00 | 300 | 3 | 78 | 15000 | 2.14286 | 6 |
| 9 | R008 | 3000 | 1200.00 | 600 | 15 | 75 | 35000 | 2.50000 | 30 |
| 10 | R010 | 3500 | 1400.00 | 700 | 20 | 65 | 40000 | 2.50000 | 40 |
| 11 | R011 | 3700 | 1500.00 | 750 | 22 | 60 | 42000 | 2.46667 | 44 |
| 12 | R012 | 4000 | 1600.00 | 800 | 25 | 55 | 45000 | 2.50000 | 50 |
Key Points
· DATA step → Creates datasets; used for transformation
· SET → Reads an existing dataset
· IF → Conditional logic
· PROC SORT → Orders data
· PROC SQL → Advanced querying
· Macros → Reusability
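PROC SORT with DESCENDING corresponds to an ordinary descending sort on the efficiency column; a minimal Python sketch (illustrative route/efficiency pairs taken from the table above):

```python
# (Route_ID, Route_Efficiency) pairs; sort best-performing route first.
routes = [("R001", 85), ("R004", 95), ("R009", 100), ("R012", 55)]
ranked = sorted(routes, key=lambda r: r[1], reverse=True)  # like BY DESCENDING Route_Efficiency
```

`ranked` starts with R009 (efficiency 100) and ends with R012 (efficiency 55), matching the order of the PROC SORT output.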
9. Merge Example
data extra;
input Route_ID $ Region $;
datalines;
R001 Asia
R002 Europe
;
run;
proc print data=extra;
run;
OUTPUT:
| Obs | Route_ID | Region |
|---|---|---|
| 1 | R001 | Asia |
| 2 | R002 | Europe |
Before a MERGE, both datasets must be sorted by the BY variable, here Route_ID. (Sorting extra by Route_Efficiency would fail, because that variable does not exist in extra.)
proc sort data=cargo_derived;
by Route_ID;
run;
proc sort data=extra;
by Route_ID;
run;
data merged_data;
merge cargo_derived extra;
by Route_ID;
run;
proc print data=merged_data;
run;
OUTPUT:
| Obs | Route_ID | Distance_nm | Fuel_Consumption | Cargo_Tons | Weather_Delay | Route_Efficiency | Fees | Fuel_Efficiency | Delay_Impact | Region |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | R001 | 1200 | 500.00 | 200 | 2 | 85 | 10000 | 2.40000 | 4 | Asia |
| 2 | R002 | 1500 | 700.00 | 300 | 3 | 78 | 15000 | 2.14286 | 6 | Europe |
| 3 | R003 | 1800 | 800.00 | 350 | 5 | 90 | 20000 | 2.25000 | 10 | |
| 4 | R004 | 2000 | 900.00 | 400 | 7 | 95 | 25000 | 2.22222 | 14 | |
| 5 | R005 | 2200 | 666.67 | 450 | 8 | 92 | 27000 | 3.30000 | 16 | |
| 6 | R006 | 2500 | 1000.00 | 500 | 10 | 88 | 30000 | 2.50000 | 20 | |
| 7 | R007 | 2000 | 1100.00 | 550 | 12 | 80 | 32000 | 1.81818 | 24 | |
| 8 | R008 | 3000 | 1200.00 | 600 | 15 | 75 | 35000 | 2.50000 | 30 | |
| 9 | R009 | 3200 | 1300.00 | 650 | 18 | 100 | 38000 | 2.46154 | 36 | |
| 10 | R010 | 3500 | 1400.00 | 700 | 20 | 65 | 40000 | 2.50000 | 40 | |
| 11 | R011 | 3700 | 1500.00 | 750 | 22 | 60 | 42000 | 2.46667 | 44 | |
| 12 | R012 | 4000 | 1600.00 | 800 | 25 | 55 | 45000 | 2.50000 | 50 | |
Routes without a matching row in extra keep a blank Region.
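The merge is logically a keyed lookup: each route picks up its Region where one exists. A plain-Python sketch of the same idea (illustrative names, three sample routes):

```python
# Lookup table corresponding to the 'extra' dataset.
region = {"R001": "Asia", "R002": "Europe"}
derived = [{"Route_ID": rid} for rid in ["R001", "R002", "R003"]]

# Left-merge by Route_ID: unmatched routes get None (blank Region in SAS).
merged = [{**row, "Region": region.get(row["Route_ID"])} for row in derived]
```

A dict lookup needs no sorting, whereas the SAS MERGE is sequential and therefore requires both inputs sorted by the BY variable.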
10. PROC SQL
proc sql;
select Route_ID, avg(Fuel_Consumption) as AvgFuel
from cargo_clean
group by Route_ID;
quit;
OUTPUT:
| Route_ID | AvgFuel |
|---|---|
| R001 | 500 |
| R002 | 700 |
| R003 | 800 |
| R004 | 900 |
| R005 | 666.6667 |
| R006 | 1000 |
| R007 | 1100 |
| R008 | 1200 |
| R009 | 1300 |
| R010 | 1400 |
| R011 | 1500 |
| R012 | 1600 |
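Because each Route_ID occurs exactly once in cargo_clean, GROUP BY Route_ID returns each route's own fuel value; an average only differs from the raw value when a key repeats. A Python sketch makes this visible using a hypothetical second fuel reading for R001 (the repeated value 520 is invented for illustration):

```python
from collections import defaultdict
from statistics import mean

# (Route_ID, Fuel_Consumption) pairs; R001 appears twice on purpose.
fuel = [("R001", 500), ("R001", 520), ("R002", 700)]

by_route = defaultdict(list)
for rid, value in fuel:
    by_route[rid].append(value)

# Equivalent of: select Route_ID, avg(Fuel_Consumption) ... group by Route_ID
avg_fuel = {rid: mean(values) for rid, values in by_route.items()}
```

R001 averages to 510 while R002, with a single row, simply returns 700, which is why every AvgFuel in the output above equals the route's original value.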
11. PROC TRANSPOSE
proc transpose data=cargo_clean out=transposed;
var Distance_nm Fuel_Consumption;
run;
proc print data=transposed;
run;
OUTPUT:
| Obs | _NAME_ | COL1 | COL2 | COL3 | COL4 | COL5 | COL6 | COL7 | COL8 | COL9 | COL10 | COL11 | COL12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Distance_nm | 1200 | 1500 | 1800 | 2000 | 2200.00 | 2500 | 2000 | 3000 | 3200 | 3500 | 3700 | 4000 |
| 2 | Fuel_Consumption | 500 | 700 | 800 | 900 | 666.67 | 1000 | 1100 | 1200 | 1300 | 1400 | 1500 | 1600 |
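PROC TRANSPOSE turns each selected column into a row named by _NAME_, with the observations spread across COL1, COL2, and so on. A small Python sketch of that reshaping (three observations, illustrative only):

```python
# Column-oriented data, as in cargo_clean (first three routes shown).
table = {"Distance_nm": [1200, 1500, 1800], "Fuel_Consumption": [500, 700, 800]}

# One output row per input column, values fanned out into COL1..COLn.
transposed = [
    {"_NAME_": name, **{f"COL{i + 1}": v for i, v in enumerate(values)}}
    for name, values in table.items()
]
```

The first output row carries Distance_nm's values and the second carries Fuel_Consumption's, mirroring the _NAME_/COLn layout of the SAS output above.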
12. PROC MEANS
proc means data=cargo_clean;
var Distance_nm Fuel_Consumption Cargo_Tons;
run;
OUTPUT:
The MEANS Procedure
| Variable | N | Mean | Std Dev | Minimum | Maximum |
|---|---|---|---|---|---|
| Distance_nm | 12 | 2550.00 | 913.0368906 | 1200.00 | 4000.00 |
| Fuel_Consumption | 12 | 1055.56 | 353.4343290 | 500.0000000 | 1600.00 |
| Cargo_Tons | 12 | 520.8333333 | 187.6388375 | 200.0000000 | 800.0000000 |
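The PROC MEANS figures can be reproduced with Python's standard statistics module (a cross-check, not part of the SAS workflow; note the cleaned Distance_nm values, with R007 set to 2000):

```python
from statistics import mean, stdev

# Distance_nm from cargo_clean (R007 corrected to 2000).
distance = [1200, 1500, 1800, 2000, 2200, 2500, 2000, 3000, 3200, 3500, 3700, 4000]

summary = {
    "N": len(distance),
    "Mean": mean(distance),
    "StdDev": stdev(distance),  # sample standard deviation, as PROC MEANS reports
    "Min": min(distance),
    "Max": max(distance),
}
```

The result (N=12, Mean=2550, Std Dev≈913.04, Min=1200, Max=4000) matches the Distance_nm row of the table above.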
13. PROC FREQ
proc freq data=cargo_clean;
tables Weather_Delay;
run;
OUTPUT:
The FREQ Procedure
| Weather_Delay | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| 2 | 1 | 8.33 | 1 | 8.33 |
| 3 | 1 | 8.33 | 2 | 16.67 |
| 5 | 1 | 8.33 | 3 | 25.00 |
| 7 | 1 | 8.33 | 4 | 33.33 |
| 8 | 1 | 8.33 | 5 | 41.67 |
| 10 | 1 | 8.33 | 6 | 50.00 |
| 12 | 1 | 8.33 | 7 | 58.33 |
| 15 | 1 | 8.33 | 8 | 66.67 |
| 18 | 1 | 8.33 | 9 | 75.00 |
| 20 | 1 | 8.33 | 10 | 83.33 |
| 22 | 1 | 8.33 | 11 | 91.67 |
| 25 | 1 | 8.33 | 12 | 100.00 |
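Every delay value in this dataset is unique, so each PROC FREQ row shows frequency 1 and percent 8.33. A Python sketch with a deliberately repeated value (the four-element list is hypothetical) shows how the cumulative columns are built:

```python
from collections import Counter

delays = [2, 3, 5, 2]          # hypothetical sample with one repeated delay
freq = Counter(delays)
total = len(delays)

cum = 0
table = []
for value in sorted(freq):
    cum += freq[value]
    # (value, frequency, percent, cumulative frequency, cumulative percent)
    table.append((value, freq[value], round(100 * freq[value] / total, 2),
                  cum, round(100 * cum / total, 2)))
```

The repeated delay of 2 gets frequency 2 and percent 50, and the cumulative percent reaches 100 on the last row, exactly as in a PROC FREQ table.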
14. PROC SGPLOT
proc sgplot data=cargo_clean;
scatter x=Distance_nm y=Fuel_Consumption;
run;
OUTPUT: (scatter plot of Fuel_Consumption against Distance_nm)
15. Macro Implementation
%macro summary;
proc means data=cargo_clean;
run;
%mend;
%summary;
OUTPUT:
The MEANS Procedure
| Variable | N | Mean | Std Dev | Minimum | Maximum |
|---|---|---|---|---|---|
| Distance_nm | 12 | 2550.00 | 913.0368906 | 1200.00 | 4000.00 |
| Fuel_Consumption | 12 | 1055.56 | 353.4343290 | 500.0000000 | 1600.00 |
| Cargo_Tons | 12 | 520.8333333 | 187.6388375 | 200.0000000 | 800.0000000 |
| Weather_Delay | 12 | 12.2500000 | 7.7003542 | 2.0000000 | 25.0000000 |
| Route_Efficiency | 12 | 80.2500000 | 14.2581715 | 55.0000000 | 100.0000000 |
| Fees | 12 | 29916.67 | 10958.25 | 10000.00 | 45000.00 |
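A SAS macro packages a block of code for reuse, the same role a function plays in most languages. A Python sketch of the macro idea (the function name `summary` and the two-column sample are illustrative):

```python
from statistics import mean

def summary(dataset):
    """Reusable summary, like %macro summary: define once, call per dataset."""
    return {name: mean(values) for name, values in dataset.items()}

# One call per 'dataset', analogous to invoking %summary;
stats = summary({"Distance_nm": [1200, 1500], "Fees": [10000, 15000]})
```

A more flexible SAS macro would take the dataset name as a parameter rather than hard-coding cargo_clean, just as this function takes `dataset` as an argument.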
16. PROC APPEND
proc append base=cargo_clean
data=extra Force;
run;
proc print data=cargo_clean;
run;
OUTPUT:
| Obs | Route_ID | Distance_nm | Fuel_Consumption | Cargo_Tons | Weather_Delay | Route_Efficiency | Fees |
|---|---|---|---|---|---|---|---|
| 1 | R001 | 1200 | 500.00 | 200 | 2 | 85 | 10000 |
| 2 | R002 | 1500 | 700.00 | 300 | 3 | 78 | 15000 |
| 3 | R003 | 1800 | 800.00 | 350 | 5 | 90 | 20000 |
| 4 | R004 | 2000 | 900.00 | 400 | 7 | 95 | 25000 |
| 5 | R005 | 2200 | 666.67 | 450 | 8 | 92 | 27000 |
| 6 | R006 | 2500 | 1000.00 | 500 | 10 | 88 | 30000 |
| 7 | R007 | 2000 | 1100.00 | 550 | 12 | 80 | 32000 |
| 8 | R008 | 3000 | 1200.00 | 600 | 15 | 75 | 35000 |
| 9 | R009 | 3200 | 1300.00 | 650 | 18 | 100 | 38000 |
| 10 | R010 | 3500 | 1400.00 | 700 | 20 | 65 | 40000 |
| 11 | R011 | 3700 | 1500.00 | 750 | 22 | 60 | 42000 |
| 12 | R012 | 4000 | 1600.00 | 800 | 25 | 55 | 45000 |
| 13 | R001 | . | . | . | . | . | . |
| 14 | R002 | . | . | . | . | . | . |
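The FORCE option lets PROC APPEND accept rows whose variables differ from the base table: Region is dropped (it is not in cargo_clean) and every base variable absent from extra is filled with missing values, which is why observations 13 and 14 are mostly "." above. A Python sketch of that behavior (illustrative names, one base column kept small):

```python
base_cols = ["Route_ID", "Distance_nm"]
base = [{"Route_ID": "R001", "Distance_nm": 1200}]
extra = [{"Route_ID": "R001", "Region": "Asia"}]

# FORCE-style append: keep only the base table's columns,
# filling variables absent from the appended rows with None (SAS '.').
for row in extra:
    base.append({col: row.get(col) for col in base_cols})
```

Only Route_ID survives the append; Distance_nm is None and Region is silently discarded, mirroring the SAS result.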
17. PROC DATASETS
proc datasets library=work;
delete cargo_error;
run;
quit;
LOG:
NOTE: Deleting WORK.CARGO_ERROR (memtype=DATA).
18. Final Clean Dataset Code
data final_cargo;
set cargo_raw;
if Fuel_Consumption=. then Fuel_Consumption=600;
if Route_Efficiency>100 then Route_Efficiency=100;
Fuel_Efficiency = Distance_nm / Fuel_Consumption;
Total_Cost = Fees + (Fuel_Consumption * 10);
run;
proc print data=final_cargo;
run;
OUTPUT:
| Obs | Route_ID | Distance_nm | Fuel_Consumption | Cargo_Tons | Weather_Delay | Route_Efficiency | Fees | Fuel_Efficiency | Total_Cost |
|---|---|---|---|---|---|---|---|---|---|
| 1 | R001 | 1200 | 500 | 200 | 2 | 85 | 10000 | 2.40000 | 15000 |
| 2 | R002 | 1500 | 700 | 300 | 3 | 78 | 15000 | 2.14286 | 22000 |
| 3 | R003 | 1800 | 800 | 350 | 5 | 90 | 20000 | 2.25000 | 28000 |
| 4 | R004 | 2000 | 900 | 400 | 7 | 95 | 25000 | 2.22222 | 34000 |
| 5 | R005 | 2200 | 950 | 450 | 8 | 92 | 27000 | 2.31579 | 36500 |
| 6 | R006 | 2500 | 1000 | 500 | 10 | 88 | 30000 | 2.50000 | 40000 |
| 7 | R007 | 2700 | 1100 | 550 | 12 | 80 | 32000 | 2.45455 | 43000 |
| 8 | R008 | 3000 | 1200 | 600 | 15 | 75 | 35000 | 2.50000 | 47000 |
| 9 | R009 | 3200 | 1300 | 650 | 18 | 70 | 38000 | 2.46154 | 51000 |
| 10 | R010 | 3500 | 1400 | 700 | 20 | 65 | 40000 | 2.50000 | 54000 |
| 11 | R011 | 3700 | 1500 | 750 | 22 | 60 | 42000 | 2.46667 | 57000 |
| 12 | R012 | 4000 | 1600 | 800 | 25 | 55 | 45000 | 2.50000 | 61000 |
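The Total_Cost formula is fees plus fuel consumption priced at 10 per unit (the constant used in the DATA step above). A one-function Python sketch of the same arithmetic (the function name and the `fuel_price` parameter are illustrative):

```python
def total_cost(fees, fuel_consumption, fuel_price=10):
    """Total route cost: fixed fees plus fuel priced per unit consumed."""
    return fees + fuel_consumption * fuel_price

# R001: 10000 + 500 * 10  and  R012: 45000 + 1600 * 10
costs = [total_cost(10000, 500), total_cost(45000, 1600)]
```

These reproduce the first and last Total_Cost values in the table (15000 and 61000).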
19. Key Insights
· High-distance routes → higher fuel usage
· Weather delays impact efficiency
· Incorrect data leads to wrong cost estimation
· Cleaning improves decision accuracy
20. 20 Key Points About The Project
1. The project simulates a global cargo shipping dataset representing real-world logistics operations across multiple routes.
2. It includes critical variables such as Distance, Fuel Consumption, Cargo Tons, Weather Delay, Route Efficiency, and Fees.
3. The dataset is created in both SAS and R with identical structures, ensuring cross-platform consistency.
4. Intentional data errors are introduced to mimic real-world data quality issues like missing values and invalid entries.
5. Advanced SAS DATA step techniques are used to identify missing and incorrect values efficiently.
6. Errors such as invalid numeric values and unrealistic efficiency percentages (>100%) are detected and corrected.
7. The project demonstrates proper use of SAS statements including DATA, SET, IF, MERGE, and PROC SORT.
8. Incorrect function usage (such as AVG() in the DATA step) is identified and corrected using appropriate SAS functions like MEAN().
9. Data type mismatches are handled using functions like PUT(), INPUT(), and NOTDIGIT().
10. Missing values are imputed using logical replacements and statistical averages.
11. Derived variables such as Fuel Efficiency and Delay Impact are created to enhance analytical insights.
12. PROC SORT is used to organize routes based on efficiency, aiding decision-making.
13. PROC SQL is leveraged for aggregation, filtering, and advanced querying of shipping data.
14. PROC MEANS and PROC FREQ are used for statistical summaries and frequency analysis.
15. PROC TRANSPOSE reshapes data for reporting and visualization flexibility.
16. SAS macros are implemented to automate repetitive processes, improving efficiency and scalability.
17. Data merging techniques integrate additional route-level information such as regions.
18. Visualization using PROC SGPLOT helps identify relationships between distance and fuel consumption.
19. The cleaned dataset enables better cost optimization by analyzing fuel usage and route delays.
20. Overall, the project showcases how SAS programming can transform raw, error-prone data into high-quality, decision-ready datasets for logistics optimization.
Project Insights
This project demonstrates:
- Data Cleaning
- Error Handling
- Data Transformation
- Business Insight Generation
- Automation using Macros
21. Summary
This
project demonstrates how Advanced SAS Programming can effectively handle
real-world cargo shipping data challenges. A raw dataset was created containing
route-level shipping metrics such as distance, fuel consumption, and
efficiency. Intentional errors like missing values, invalid data types, and
unrealistic values were introduced to simulate real-world data quality issues.
Using core SAS statements, these errors were identified and corrected
systematically. Various procedures such as PROC SORT, PROC SQL, PROC MEANS, and
PROC TRANSPOSE were applied to analyze and restructure the data. Derived
variables such as fuel efficiency and delay impact were calculated to provide
deeper business insights. Macros were used to automate repetitive tasks,
improving efficiency. Overall, the project highlights how structured SAS
programming enhances data quality, supports accurate analytics, and enables
better decision-making in global shipping operations.
22. Conclusion
In the
shipping industry, data accuracy is critical for operational success and cost
optimization. This project illustrates how Advanced SAS techniques can detect
inconsistencies, correct errors, and enhance data usability. By leveraging
SAS statements, we built a robust workflow that transforms raw, error-prone
data into a clean and analyzable dataset. The inclusion of intentional errors
helped demonstrate practical debugging strategies that are highly relevant in
real-world scenarios. Furthermore, advanced procedures and macros enabled
scalable and efficient data processing. The final dataset provided meaningful
insights into route efficiency, fuel consumption, and delay impacts, allowing
stakeholders to make informed decisions. This project not only strengthens SAS
programming skills but also showcases its importance in solving complex
business problems in logistics and global shipping.
INTERVIEW QUESTIONS FOR YOU
1. The RETAIN Statement
Question:
What does the RETAIN statement do, and
how is it different from a regular assignment?
Short
Answer: By default, SAS resets variables to missing at the start of each
iteration of the DATA step. The RETAIN statement tells SAS
to hold the value of a variable from the previous observation. I use it most
often for calculating cumulative
totals (running sums) or carrying a non-missing value forward across
multiple rows.
2. FIRST. and LAST. Variables
Question:
How do FIRST.variable
and LAST.variable
work, and what is required to use them?
Short
Answer: These are temporary 'flag' variables created by SAS during BY-group
processing in a DATA step. FIRST.var is 1 for the first record in a group, and
LAST.var is 1 for the final record. To use them, the data must be sorted by
that variable first. They are essential for identifying duplicates or
performing calculations within specific groups.
3. DROP vs. KEEP (Statement vs. Dataset Option)
Question:
What is the difference between using DROP as a statement
versus a dataset option?
Short
Answer: A DROP statement
applies to the entire DATA step and removes variables from the final output
dataset. A dataset option
(e.g., (DROP=var)) is more efficient because it can be
used on the SET
statement to prevent a variable from even entering the Program Data Vector
(PDV), saving memory and processing time.
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.
Disclaimer:
The datasets and analysis in this article are created for educational and demonstration purposes only. They do not represent real cargo shipping data.
Our Mission:
This blog provides industry-focused SAS programming tutorials and analytics projects covering finance, healthcare, and technology.
This project is suitable for:
· Students learning SAS
· Data analysts building portfolios
· Professionals preparing for SAS interviews
· Bloggers, exam reviewers, and observers writing about analytics
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------