150. EXPLORING ONLINE COURSE ENGAGEMENT THROUGH SAS: A COMPREHENSIVE ANALYSIS UTILIZING DATA VISUALIZATION, STATISTICAL PROCEDURES, AND REPORTING TECHNIQUES
/*This post builds a simulated dataset on Online Course Engagement and uses it to demonstrate several SAS procedures for analysis and visualization.*/
Dataset Overview: Online Course Engagement
We'll simulate a dataset named course_engagement that captures student interactions with an online course platform. The dataset includes:
student_id: Unique identifier for each student
course_id: Identifier for the course
enrollment_date: Date the student enrolled in the course
completion_date: Date the student completed the course (if completed)
time_spent: Total time spent on the course (in hours)
assignments_submitted: Number of assignments submitted
quizzes_attempted: Number of quizzes attempted
final_score: Final score achieved in the course
course_rating: Rating given by the student (1 to 5)
country: Country of the student
completed: Indicator set to 1 if the student completed the course, 0 otherwise
Step 1: Data Creation
/*First, we'll create the course_engagement dataset using SAS:*/
data course_engagement;
  format enrollment_date completion_date date9.;
  /* ranuni(0) seeds from the system clock, so the generated values change on every run */
  do student_id = 1 to 20;
    course_id = ceil(ranuni(0)*10);
    enrollment_date = '01JAN2025'd + ceil(ranuni(0)*90);
    /* Roughly 80% of students complete the course within 60 days of enrolling */
    if ranuni(0) < 0.8 then do;
      completion_date = enrollment_date + ceil(ranuni(0)*60);
      completed = 1;
    end;
    else do;
      completion_date = .;
      completed = 0;
    end;
    time_spent = round(ranuni(0)*50, 0.1);
    assignments_submitted = ceil(ranuni(0)*10);
    quizzes_attempted = ceil(ranuni(0)*5);
    final_score = round(ranuni(0)*100, 0.1);
    course_rating = ceil(ranuni(0)*5);
    /* Pick one of the ten country names at random */
    country = scan("USA Canada UK India Australia Germany France Brazil Japan SouthAfrica", ceil(ranuni(0)*10));
    output;
  end;
run;
proc print data=course_engagement; run;
Output:
Obs | enrollment_date | completion_date | student_id | course_id | completed | time_spent | assignments_submitted | quizzes_attempted | final_score | course_rating | country |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 28JAN2025 | 14FEB2025 | 1 | 7 | 1 | 34.7 | 1 | 5 | 72.7 | 4 | Canada |
2 | 11JAN2025 | 21FEB2025 | 2 | 2 | 1 | 17.0 | 4 | 4 | 38.9 | 1 | France |
3 | 19MAR2025 | 15APR2025 | 3 | 5 | 1 | 27.6 | 1 | 5 | 58.4 | 2 | Brazil |
4 | 27MAR2025 | 22MAY2025 | 4 | 5 | 1 | 18.6 | 2 | 3 | 71.6 | 1 | Brazil |
5 | 13JAN2025 | 10FEB2025 | 5 | 7 | 1 | 13.6 | 2 | 4 | 54.2 | 4 | SouthAfrica |
6 | 10JAN2025 | 02FEB2025 | 6 | 6 | 1 | 28.8 | 1 | 2 | 62.0 | 2 | India |
7 | 15JAN2025 | 24JAN2025 | 7 | 6 | 1 | 36.1 | 6 | 4 | 4.9 | 4 | Australia |
8 | 17FEB2025 | 16APR2025 | 8 | 7 | 1 | 9.8 | 2 | 3 | 38.2 | 2 | Brazil |
9 | 02JAN2025 | 25FEB2025 | 9 | 10 | 1 | 32.3 | 2 | 1 | 30.5 | 4 | Japan |
10 | 01FEB2025 | . | 10 | 2 | 0 | 1.4 | 9 | 5 | 32.3 | 5 | France |
11 | 15MAR2025 | 19MAR2025 | 11 | 10 | 1 | 7.1 | 9 | 1 | 76.7 | 3 | Australia |
12 | 13FEB2025 | 18MAR2025 | 12 | 9 | 1 | 47.7 | 4 | 3 | 46.3 | 5 | Germany |
13 | 13FEB2025 | 21MAR2025 | 13 | 7 | 1 | 1.2 | 2 | 2 | 67.7 | 5 | Brazil |
14 | 01FEB2025 | 16MAR2025 | 14 | 5 | 1 | 17.0 | 4 | 2 | 38.6 | 2 | Germany |
15 | 20JAN2025 | 30JAN2025 | 15 | 7 | 1 | 32.5 | 8 | 1 | 20.1 | 2 | Brazil |
16 | 30JAN2025 | 01MAR2025 | 16 | 7 | 1 | 36.1 | 5 | 2 | 83.2 | 5 | Australia |
17 | 17FEB2025 | 14MAR2025 | 17 | 3 | 1 | 8.0 | 4 | 3 | 76.8 | 2 | France |
18 | 25JAN2025 | 13MAR2025 | 18 | 10 | 1 | 43.3 | 1 | 1 | 76.4 | 1 | Germany |
19 | 16JAN2025 | 13MAR2025 | 19 | 10 | 1 | 4.4 | 4 | 3 | 28.6 | 1 | Brazil |
20 | 06FEB2025 | . | 20 | 7 | 0 | 28.6 | 3 | 2 | 11.7 | 3 | USA |
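Because ranuni(0) takes its seed from the system clock, rerunning Step 1 will not reproduce the listing above. The following is a minimal sketch of a repeatable variant, assuming a fixed seed set with call streaminit and the rand function in place of ranuni. The seed value 12345 and the dataset name course_engagement_seeded are arbitrary choices made for illustration, and the rows it generates will differ from the listing above.

```sas
/* Sketch: same structure as Step 1, but seeded so every run gives the same rows */
data course_engagement_seeded;
  call streaminit(12345);                 /* arbitrary seed chosen for illustration */
  length country $12;                     /* long enough for "SouthAfrica" */
  format enrollment_date completion_date date9.;
  do student_id = 1 to 20;
    course_id = ceil(rand("uniform")*10);
    enrollment_date = '01JAN2025'd + ceil(rand("uniform")*90);
    completed = (rand("uniform") < 0.8);  /* 1 with probability 0.8, otherwise 0 */
    if completed then
      completion_date = enrollment_date + ceil(rand("uniform")*60);
    else
      completion_date = .;
    time_spent = round(rand("uniform")*50, 0.1);
    assignments_submitted = ceil(rand("uniform")*10);
    quizzes_attempted = ceil(rand("uniform")*5);
    final_score = round(rand("uniform")*100, 0.1);
    course_rating = ceil(rand("uniform")*5);
    country = scan("USA Canada UK India Australia Germany France Brazil Japan SouthAfrica",
                   ceil(rand("uniform")*10));
    output;
  end;
run;

proc print data=course_engagement_seeded; run;
```

Keeping the seeded copy under a separate name leaves the original course_engagement dataset available for the remaining steps.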
Step 2: Descriptive Statistics with PROC MEANS
/*To understand the central tendencies and dispersion of our numeric variables:*/
proc means data=course_engagement n mean std min max;
var time_spent assignments_submitted quizzes_attempted final_score;
run;
Output:
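Variable | N | Mean | Std Dev | Minimum | Maximum |
---|---|---|---|---|---|
time_spent | 20 | 22.29000 | 14.28035 | 1.20000 | 47.70000 |
assignments_submitted | 20 | 3.70000 | 2.57723 | 1.00000 | 9.00000 |
quizzes_attempted | 20 | 2.80000 | 1.36111 | 1.00000 | 5.00000 |
final_score | 20 | 49.49000 | 23.73647 | 4.90000 | 83.20000 |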
/*This provides insights into average time spent, assignment submissions, quiz attempts, and final scores.*/
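The same summary can be split by completion status. The sketch below assumes the completed flag from Step 1 as the CLASS variable and uses maxdec=2 only to shorten the printed values; with just 20 simulated students, the non-completed group may contain very few rows.

```sas
/* Sketch: descriptive statistics for completers and non-completers separately */
proc means data=course_engagement n mean std min max maxdec=2;
  class completed;   /* 1 = completed, 0 = not completed */
  var time_spent assignments_submitted quizzes_attempted final_score;
run;
```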
Step 3: Frequency Analysis with PROC FREQ
/*Analyzing categorical variables:*/
proc freq data=course_engagement;
tables course_rating country completed;
run;
Output:
course_rating | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
1 | 4 | 20.00 | 4 | 20.00 |
2 | 6 | 30.00 | 10 | 50.00 |
3 | 2 | 10.00 | 12 | 60.00 |
4 | 4 | 20.00 | 16 | 80.00 |
5 | 4 | 20.00 | 20 | 100.00 |
country | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
Australia | 3 | 15.00 | 3 | 15.00 |
Brazil | 6 | 30.00 | 9 | 45.00 |
Canada | 1 | 5.00 | 10 | 50.00 |
France | 3 | 15.00 | 13 | 65.00 |
Germany | 3 | 15.00 | 16 | 80.00 |
India | 1 | 5.00 | 17 | 85.00 |
Japan | 1 | 5.00 | 18 | 90.00 |
SouthAfrica | 1 | 5.00 | 19 | 95.00 |
USA | 1 | 5.00 | 20 | 100.00 |
completed | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
0 | 2 | 10.00 | 2 | 10.00 |
1 | 18 | 90.00 | 20 | 100.00 |
/*This reveals the distribution of course ratings, student countries, and completion status.*/
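The one-way tables treat each variable separately. To see how completion status is distributed within each country, a two-way table is one option. This is a sketch only; the nocol and nopercent options are chosen here to keep the cells small and are not part of the original analysis.

```sas
/* Sketch: cross-tabulation of country by completion status */
proc freq data=course_engagement;
  tables country*completed / nocol nopercent;   /* keep counts and row percentages */
run;
```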
Step 4: Correlation Analysis with PROC CORR
/*Understanding relationships between numeric variables:*/
proc corr data=course_engagement;
var time_spent assignments_submitted quizzes_attempted final_score;
run;
Output:
4 Variables: | time_spent assignments_submitted quizzes_attempted final_score |
---|---|
Simple Statistics
Variable | N | Mean | Std Dev | Sum | Minimum | Maximum |
---|---|---|---|---|---|---|
time_spent | 20 | 22.29000 | 14.28035 | 445.80000 | 1.20000 | 47.70000 |
assignments_submitted | 20 | 3.70000 | 2.57723 | 74.00000 | 1.00000 | 9.00000 |
quizzes_attempted | 20 | 2.80000 | 1.36111 | 56.00000 | 1.00000 | 5.00000 |
final_score | 20 | 49.49000 | 23.73647 | 989.80000 | 4.90000 | 83.20000 |
Pearson Correlation Coefficients, N = 20; Prob > |r| under H0: Rho=0
(4 x 4 matrix over time_spent, assignments_submitted, quizzes_attempted, and final_score; cell values not shown)
/*This identifies how time spent correlates with performance metrics.*/
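Two of the four variables are small integer counts, so a rank-based coefficient may be a useful companion to Pearson's r. The sketch below requests both; adding spearman and suppressing the simple statistics with nosimple are choices made for this illustration.

```sas
/* Sketch: Pearson and Spearman correlations side by side */
proc corr data=course_engagement pearson spearman nosimple;
  var time_spent assignments_submitted quizzes_attempted final_score;
run;
```

Because Step 1 draws every variable independently, any correlation that appears in this sample arises by chance.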
Step 5: Regression Analysis with PROC REG
/*Exploring how engagement metrics predict final scores:*/
proc reg data=course_engagement;
model final_score = time_spent assignments_submitted quizzes_attempted;
run;
quit;
/*This regression model assesses the impact of engagement on final performance.*/
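When several engagement measures enter the model together, it can help to check whether they overlap. The sketch below adds the vif option to the MODEL statement to print variance inflation factors; this check is an addition for illustration, not part of the original model.

```sas
/* Sketch: same model with variance inflation factors for each predictor */
proc reg data=course_engagement;
  model final_score = time_spent assignments_submitted quizzes_attempted / vif;
run;
quit;   /* PROC REG stays active after RUN, so QUIT ends it */
```

Since the predictors in Step 1 are generated independently of final_score, the fitted coefficients here reflect sampling noise rather than a real effect.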
Step 6: Data Visualization with PROC SGPLOT
/*Visualizing the relationship between time spent and final score:*/
proc sgplot data=course_engagement;
scatter x=time_spent y=final_score / group=completed;
reg x=time_spent y=final_score / group=completed;
xaxis label="Time Spent (hours)";
yaxis label="Final Score";
run;
/*This scatter plot with a regression line per group compares students who completed the course with those who did not.*/
Step 7: Geographic Distribution with PROC GCHART
/*Visualizing student distribution by country:*/
proc gchart data=course_engagement;
vbar country / discrete;
run;
quit;
/*This bar chart shows the number of students from each country.*/
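PROC GCHART is part of SAS/GRAPH, which may not be available in every installation. A comparable chart can be sketched with PROC SGPLOT, which the other plotting steps already use; the categoryorder=respdesc option and the axis labels are choices made here for illustration.

```sas
/* Sketch: student counts per country as an SGPLOT bar chart */
proc sgplot data=course_engagement;
  vbar country / categoryorder=respdesc;   /* tallest bar first */
  xaxis label="Country";
  yaxis label="Number of Students";
run;
```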
Step 8: Box Plot with PROC SGPLOT
/*Analyzing score distribution by course rating:*/
proc sgplot data=course_engagement;
vbox final_score / category=course_rating;
xaxis label="Course Rating";
yaxis label="Final Score";
run;
/*This box plot highlights how student ratings relate to their final scores.*/
Step 9: Panel Plot with PROC SGPANEL
/*Comparing time spent across countries:*/
proc sgpanel data=course_engagement;
panelby country / columns=3;
histogram time_spent;
colaxis label="Time Spent (hours)";
run;
/*This panel plot provides a country-wise distribution of time spent.*/
Step 10: Creating a Summary Report with PROC REPORT
/*Generating a summary table:*/
proc report data=course_engagement nowd;
column country completed n final_score;
define country / group;
define completed / group;
define n / "Number of Students";
define final_score / analysis mean "Average Final Score";
run;
Output:
country | completed | Number of Students | Average Final Score |
---|---|---|---|
Australia | 1 | 3 | 54.933333 |
Brazil | 1 | 6 | 47.433333 |
Canada | 1 | 1 | 72.7 |
France | 0 | 1 | 32.3 |
France | 1 | 2 | 57.85 |
Germany | 1 | 3 | 53.766667 |
India | 1 | 1 | 62 |
Japan | 1 | 1 | 30.5 |
SouthAfrica | 1 | 1 | 54.2 |
USA | 0 | 1 | 11.7 |
/*This report summarizes the number of students and average scores by country and completion status.*/
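The same grouped summary can be cross-checked with PROC SQL. This is a sketch of an equivalent query, not the method used to produce the table above; the column aliases n_students and mean_final_score are names introduced here for illustration.

```sas
/* Sketch: student count and mean final score by country and completion status */
proc sql;
  select country,
         completed,
         count(*)          as n_students       label="Number of Students",
         mean(final_score) as mean_final_score label="Average Final Score"
  from course_engagement
  group by country, completed;
quit;
```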
Step 11: Exporting Results with PROC EXPORT
/*Exporting the dataset to a CSV file:*/
proc export data=course_engagement
outfile="course_engagement.csv"
dbms=csv
replace;
run;
/*This allows for sharing or further analysis in other tools.*/
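A relative outfile path is resolved against the SAS session's working directory, which may not be writable in some server-based environments. The sketch below assumes a macro variable outdir that points to a writable folder (the path shown is a placeholder to replace), then reads the file back into a hypothetical dataset course_check to confirm the export.

```sas
/* Sketch: export to an explicit folder, then re-import to verify the file */
%let outdir = /path/to/output;   /* placeholder; replace with a writable directory */

proc export data=course_engagement
  outfile="&outdir./course_engagement.csv"
  dbms=csv
  replace;
run;

proc import datafile="&outdir./course_engagement.csv"
  out=course_check               /* hypothetical name for the re-imported copy */
  dbms=csv
  replace;
  getnames=yes;
run;

proc contents data=course_check; run;
```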