155. URBAN TRAFFIC PATTERN ANALYSIS USING SAS | PROC PRINT | PROC CONTENTS | PROC MEANS | PROC FREQ | PROC CORR | PROC SORT | PROC SGPLOT | A COMPREHENSIVE GUIDE TO DATA EXPLORATION, TRANSFORMATION, AND VISUALIZATION
/*Create a unique dataset centered around urban traffic patterns*/
Dataset Overview: Urban Traffic Patterns
Objective: To analyze traffic congestion patterns across different intersections in a city over a week.
Variables:
Intersection_ID (Character): Unique identifier for each intersection.
Day (Character): Day of the week (e.g., Monday, Tuesday).
Time_Slot (Character): Time interval (e.g., 08:00-09:00).
Vehicle_Count (Numeric): Number of vehicles passing through during the time slot.
Average_Speed (Numeric): Average speed of vehicles (km/h).
Accident_Count (Numeric): Number of accidents reported (simulated here as 0 or 1 per time slot).
Step 1: Creating the Dataset in SAS
/*We'll simulate data for 5 intersections over 7 days with 3 time slots each day.*/
data Traffic_Data;
  length Intersection_ID $10 Day $10 Time_Slot $11;
  do i = 1 to 5;                       /* 5 intersections: INT01 to INT05 */
    Intersection_ID = cats('INT', put(i, z2.));
    do j = 1 to 7;                     /* 7 days of the week */
      Day = scan('Monday Tuesday Wednesday Thursday Friday Saturday Sunday', j);
      do k = 1 to 3;                   /* 3 time slots per day */
        select (k);
          when (1) Time_Slot = '08:00-09:00';
          when (2) Time_Slot = '12:00-13:00';
          when (3) Time_Slot = '17:00-18:00';
        end;
        /* Seed 0 takes the seed from the system clock, so values differ on every run */
        Vehicle_Count = int(100 + ranuni(0)*400);      /* 100 to 499 vehicles */
        Average_Speed = round(30 + ranuni(0)*40, 0.1); /* 30.0 to 70.0 km/h */
        Accident_Count = rand('bernoulli', 0.05);      /* 1 with probability 0.05, else 0 */
        output;
      end;
    end;
  end;
  drop i j k;
run;
proc print data=Traffic_Data (obs=20);
title 'Sample of Traffic Data';
run;
Output: Sample of Traffic Data |
Obs | Intersection_ID | Day | Time_Slot | Vehicle_Count | Average_Speed | Accident_Count |
---|---|---|---|---|---|---|
1 | INT01 | Monday | 08:00-09:00 | 311 | 66.9 | 0 |
2 | INT01 | Monday | 12:00-13:00 | 199 | 62.8 | 0 |
3 | INT01 | Monday | 17:00-18:00 | 173 | 69.5 | 0 |
4 | INT01 | Tuesday | 08:00-09:00 | 163 | 43.4 | 0 |
5 | INT01 | Tuesday | 12:00-13:00 | 182 | 31.8 | 0 |
6 | INT01 | Tuesday | 17:00-18:00 | 265 | 46.9 | 0 |
7 | INT01 | Wednesday | 08:00-09:00 | 315 | 35.1 | 0 |
8 | INT01 | Wednesday | 12:00-13:00 | 410 | 52.5 | 0 |
9 | INT01 | Wednesday | 17:00-18:00 | 482 | 60.3 | 0 |
10 | INT01 | Thursday | 08:00-09:00 | 159 | 68.2 | 0 |
11 | INT01 | Thursday | 12:00-13:00 | 249 | 36.7 | 0 |
12 | INT01 | Thursday | 17:00-18:00 | 252 | 42.8 | 0 |
13 | INT01 | Friday | 08:00-09:00 | 253 | 67.1 | 1 |
14 | INT01 | Friday | 12:00-13:00 | 123 | 50.2 | 0 |
15 | INT01 | Friday | 17:00-18:00 | 294 | 62.9 | 0 |
16 | INT01 | Saturday | 08:00-09:00 | 466 | 56.5 | 0 |
17 | INT01 | Saturday | 12:00-13:00 | 177 | 63.0 | 0 |
18 | INT01 | Saturday | 17:00-18:00 | 446 | 67.2 | 0 |
19 | INT01 | Sunday | 08:00-09:00 | 494 | 32.3 | 0 |
20 | INT01 | Sunday | 12:00-13:00 | 364 | 48.6 | 0 |
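Because ranuni(0) and an unseeded rand() both take their seed from the system clock, rerunning Step 1 will not reproduce the exact values shown above. The following is a minimal sketch of a repeatable variant, not the original method: the dataset name Traffic_Data_Seeded and the seed 12345 are assumptions chosen for illustration, and rand('uniform') is swapped in for ranuni so that a single CALL STREAMINIT controls every draw.

```sas
/* Sketch: same structure as Traffic_Data, but with a fixed seed */
data Traffic_Data_Seeded;                /* hypothetical dataset name */
  call streaminit(12345);                /* arbitrary seed, for illustration only */
  length Intersection_ID $10 Day $10 Time_Slot $11;
  do i = 1 to 5;
    Intersection_ID = cats('INT', put(i, z2.));
    do j = 1 to 7;
      Day = scan('Monday Tuesday Wednesday Thursday Friday Saturday Sunday', j);
      do k = 1 to 3;
        /* Blank is the only delimiter, so the ':' and '-' inside each slot are kept */
        Time_Slot = scan('08:00-09:00 12:00-13:00 17:00-18:00', k, ' ');
        Vehicle_Count  = int(100 + rand('uniform')*400);
        Average_Speed  = round(30 + rand('uniform')*40, 0.1);
        Accident_Count = rand('bernoulli', 0.05);
        output;
      end;
    end;
  end;
  drop i j k;
run;
```

With a fixed seed, repeated runs should produce identical rows, although the values will differ from the sample output above.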
Explanation:
Understanding the Nested DO Loops
do i = 1 to 5; /* 5 intersections */
do j = 1 to 7; /* 7 days of the week */
do k = 1 to 3; /* 3 time slots per day */
/* Data generation and output */
end;
end;
end;
Outer Loop (i): Iterates over 5 intersections.
Middle Loop (j): Iterates over 7 days (Monday to Sunday).
Inner Loop (k): Iterates over 3 time slots per day.
Calculating Total Observations
To determine the total number of observations:
Intersections: 5
Days per Intersection: 7
Time Slots per Day: 3
Total Observations = 5 (intersections) × 7 (days) × 3 (time slots) = 105 observations
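The expected row count can be checked against the dataset itself. A minimal sketch, assuming Traffic_Data from Step 1 exists in the WORK library; the variable names n_obs and expected are illustrative only:

```sas
data _null_;
  /* NOBS= is set at compile time, so no rows are actually read */
  if 0 then set Traffic_Data nobs=n_obs;
  expected = 5 * 7 * 3;   /* intersections x days x time slots */
  put 'Expected rows: ' expected ' Actual rows: ' n_obs;
  stop;                   /* SET never executes, so end the step explicitly */
run;
```

Both numbers written to the log should agree if the dataset was built exactly as in Step 1.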
Step 2: Exploring the Dataset
2.1 Dataset Structure
proc contents data=Traffic_Data;
title 'Structure of Traffic Data';
run;
Output:
Data Set Name | WORK.TRAFFIC_DATA | Observations | 105 |
---|---|---|---|
Member Type | DATA | Variables | 6 |
Engine | V9 | Indexes | 0 |
Created | 14/09/2015 00:19:29 | Observation Length | 56 |
Last Modified | 14/09/2015 00:19:29 | Deleted Observations | 0 |
Protection | | Compressed | NO |
Data Set Type | | Sorted | NO |
Label | | | |
Data Representation | WINDOWS_64 | | |
Encoding | wlatin1 Western (Windows) | | |
Engine/Host Dependent Information | |
---|---|
Data Set Page Size | 65536 |
Number of Data Set Pages | 1 |
First Data Page | 1 |
Max Obs per Page | 1167 |
Obs in First Data Page | 105 |
Number of Data Set Repairs | 0 |
ExtendObsCounter | YES |
Filename | C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD13268_DESKTOP-QFAA4KV_\traffic_data.sas7bdat |
Release Created | 9.0401M2 |
Host Created | X64_8HOME |
Alphabetic List of Variables and Attributes | |||
---|---|---|---|
# | Variable | Type | Len |
6 | Accident_Count | Num | 8 |
5 | Average_Speed | Num | 8 |
2 | Day | Char | 10 |
1 | Intersection_ID | Char | 10 |
3 | Time_Slot | Char | 11 |
4 | Vehicle_Count | Num | 8 |
Step 3: Data Analysis Using SAS Procedures
3.1 Descriptive Statistics
proc means data=Traffic_Data n mean std min max;
var Vehicle_Count Average_Speed Accident_Count;
title 'Descriptive Statistics for Traffic Variables';
run;
Output:
Descriptive Statistics for Traffic Variables |
Variable | N | Mean | Std Dev | Minimum | Maximum |
---|---|---|---|---|---|
3.2 Frequency of Accidents by Day
proc freq data=Traffic_Data;
tables Day*Accident_Count / nocol nopercent;
title 'Frequency of Accidents by Day';
run;
Output:
Frequency of Accidents by Day |
3.3 Average Vehicle Count by Time Slot
proc means data=Traffic_Data mean;
class Time_Slot;
var Vehicle_Count;
title 'Average Vehicle Count by Time Slot';
run;
Output:
Average Vehicle Count by Time Slot |
Analysis Variable : Vehicle_Count | ||
---|---|---|
Time_Slot | N Obs | Mean |
08:00-09:00 | 35 | 327.5428571 |
12:00-13:00 | 35 | 285.9714286 |
17:00-18:00 | 35 | 290.1714286 |
Step 4: Data Transformation
4.1 Creating a Congestion Level Variable
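data Traffic_Data_Transformed;
  set Traffic_Data;
  length Congestion_Level $6;
  /* More than 400 vehicles is High, 201 to 400 is Medium, 200 or fewer is Low */
  if Vehicle_Count > 400 then Congestion_Level = 'High';
  else if Vehicle_Count > 200 then Congestion_Level = 'Medium';
  else Congestion_Level = 'Low';
run;
/* Name the dataset and limit the listing to the 20 rows shown below */
proc print data=Traffic_Data_Transformed (obs=20);
  title 'Sample of Transformed Traffic Data';
run;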
Output:
Obs | Intersection_ID | Day | Time_Slot | Vehicle_Count | Average_Speed | Accident_Count | Congestion_Level |
---|---|---|---|---|---|---|---|
1 | INT01 | Monday | 08:00-09:00 | 311 | 66.9 | 0 | Medium |
2 | INT01 | Monday | 12:00-13:00 | 199 | 62.8 | 0 | Low |
3 | INT01 | Monday | 17:00-18:00 | 173 | 69.5 | 0 | Low |
4 | INT01 | Tuesday | 08:00-09:00 | 163 | 43.4 | 0 | Low |
5 | INT01 | Tuesday | 12:00-13:00 | 182 | 31.8 | 0 | Low |
6 | INT01 | Tuesday | 17:00-18:00 | 265 | 46.9 | 0 | Medium |
7 | INT01 | Wednesday | 08:00-09:00 | 315 | 35.1 | 0 | Medium |
8 | INT01 | Wednesday | 12:00-13:00 | 410 | 52.5 | 0 | High |
9 | INT01 | Wednesday | 17:00-18:00 | 482 | 60.3 | 0 | High |
10 | INT01 | Thursday | 08:00-09:00 | 159 | 68.2 | 0 | Low |
11 | INT01 | Thursday | 12:00-13:00 | 249 | 36.7 | 0 | Medium |
12 | INT01 | Thursday | 17:00-18:00 | 252 | 42.8 | 0 | Medium |
13 | INT01 | Friday | 08:00-09:00 | 253 | 67.1 | 1 | Medium |
14 | INT01 | Friday | 12:00-13:00 | 123 | 50.2 | 0 | Low |
15 | INT01 | Friday | 17:00-18:00 | 294 | 62.9 | 0 | Medium |
16 | INT01 | Saturday | 08:00-09:00 | 466 | 56.5 | 0 | High |
17 | INT01 | Saturday | 12:00-13:00 | 177 | 63.0 | 0 | Low |
18 | INT01 | Saturday | 17:00-18:00 | 446 | 67.2 | 0 | High |
19 | INT01 | Sunday | 08:00-09:00 | 494 | 32.3 | 0 | High |
20 | INT01 | Sunday | 12:00-13:00 | 364 | 48.6 | 0 | Medium |
4.2 Frequency of Congestion Levels
proc freq data=Traffic_Data_Transformed;
tables Congestion_Level;
title 'Frequency of Congestion Levels';
run;
Output:
Frequency of Congestion Levels |
Congestion_Level | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
High | 28 | 26.67 | 28 | 26.67 |
Low | 28 | 26.67 | 56 | 53.33 |
Medium | 49 | 46.67 | 105 | 100.00 |
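As an alternative to storing a new variable, the same thresholds could be kept in a user-defined format and applied when the table is produced. This is a sketch under stated assumptions rather than part of the original workflow: the format name congest is hypothetical, and the ranges are written to mirror the IF/ELSE logic in 4.1.

```sas
proc format;
  /* Ranges mirror the IF/ELSE thresholds from 4.1 */
  value congest
    low  -  200  = 'Low'       /* 200 or fewer */
    200 <-  400  = 'Medium'    /* above 200, up to and including 400 */
    400 <-  high = 'High';     /* above 400 */
run;

proc freq data=Traffic_Data;
  tables Vehicle_Count;
  format Vehicle_Count congest.;   /* counts are grouped by formatted value */
  title 'Frequency of Congestion Levels (Format-Based)';
run;
```

Run on the same data, the three counts should match the table above. One difference to keep in mind: a missing Vehicle_Count would fall into 'Low' under the IF/ELSE logic but is not covered by these format ranges.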
Step 5: Advanced Analysis
5.1 Correlation Between Vehicle Count and Average Speed
proc corr data=Traffic_Data;
var Vehicle_Count Average_Speed;
title 'Correlation Between Vehicle Count and Average Speed';
run;
Output:
5.2 Identifying Peak Congestion Times
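/* OUT= writes the sorted copy to a new dataset and leaves Traffic_Data in its original order */
proc sort data=Traffic_Data out=Traffic_Sorted;
  by descending Vehicle_Count;
run;
proc print data=Traffic_Sorted (obs=20);
  title 'Top 20 Peak Congestion Records';
run;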
Output:
Top 20 Peak Congestion Records |
Obs | Intersection_ID | Day | Time_Slot | Vehicle_Count | Average_Speed | Accident_Count |
---|---|---|---|---|---|---|
1 | INT05 | Saturday | 17:00-18:00 | 496 | 63.1 | 0 |
2 | INT01 | Sunday | 08:00-09:00 | 494 | 32.3 | 0 |
3 | INT05 | Sunday | 12:00-13:00 | 488 | 62.2 | 0 |
4 | INT05 | Monday | 08:00-09:00 | 484 | 39.9 | 0 |
5 | INT03 | Monday | 12:00-13:00 | 483 | 47.8 | 0 |
6 | INT01 | Wednesday | 17:00-18:00 | 482 | 60.3 | 0 |
7 | INT04 | Monday | 08:00-09:00 | 473 | 62.0 | 0 |
8 | INT04 | Thursday | 08:00-09:00 | 472 | 66.6 | 0 |
9 | INT02 | Thursday | 08:00-09:00 | 470 | 56.3 | 0 |
10 | INT03 | Monday | 08:00-09:00 | 470 | 42.2 | 0 |
11 | INT01 | Saturday | 08:00-09:00 | 466 | 56.5 | 0 |
12 | INT01 | Saturday | 17:00-18:00 | 446 | 67.2 | 0 |
13 | INT02 | Thursday | 17:00-18:00 | 443 | 30.2 | 0 |
14 | INT03 | Wednesday | 08:00-09:00 | 442 | 36.0 | 0 |
15 | INT03 | Friday | 08:00-09:00 | 442 | 53.0 | 0 |
16 | INT04 | Thursday | 12:00-13:00 | 442 | 32.7 | 0 |
17 | INT03 | Wednesday | 12:00-13:00 | 441 | 36.6 | 0 |
18 | INT05 | Monday | 17:00-18:00 | 440 | 37.7 | 0 |
19 | INT04 | Sunday | 17:00-18:00 | 436 | 62.3 | 0 |
20 | INT02 | Tuesday | 08:00-09:00 | 435 | 63.3 | 0 |
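The list above ranks individual hourly records. To see which intersection and time slot combinations are busy on average across the week, the counts could be aggregated first. A minimal sketch using PROC SQL; the column alias Avg_Vehicles is an illustrative name, not part of the original dataset:

```sas
title 'Average Vehicle Count by Intersection and Time Slot';
proc sql;
  /* One row per intersection and time slot, averaged over the 7 days */
  select Intersection_ID,
         Time_Slot,
         mean(Vehicle_Count) as Avg_Vehicles format=8.1
  from Traffic_Data
  group by Intersection_ID, Time_Slot
  order by Avg_Vehicles desc;
quit;
```

This returns 15 rows (5 intersections × 3 time slots), with the busiest combinations first. Because the data are simulated from a uniform distribution, any differences between rows reflect random variation rather than real traffic patterns.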
Step 6: Data Visualization
6.1 Bar Chart of Average Vehicle Count by Day
proc sgplot data=Traffic_Data;
vbar Day / response=Vehicle_Count stat=mean;
title 'Average Vehicle Count by Day';
run;
6.2 Line Plot of Average Speed Over Time Slots
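proc sgplot data=Traffic_Data;
  /* VLINE with stat=mean averages the five intersections for each day and time slot */
  vline Time_Slot / response=Average_Speed stat=mean group=Day;
  title 'Average Speed Over Time Slots by Day';
run;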