227.ANALYZING MOST FOLLOWED FACEBOOK PROFILES WORLDWIDE USING PROC PRINT | PROC FORMAT | PROC MEANS | PROC FREQ | PROC SQL | PROC REPORT | PROC SORT | PROC TRANSPOSE | PROC EXPORT | PROC CONTENTS IN SAS
- Get link
- X
- Other Apps
ANALYZING MOST FOLLOWED FACEBOOK PROFILES WORLDWIDE USING PROC PRINT | PROC FORMAT | PROC MEANS | PROC FREQ | PROC SQL | PROC REPORT | PROC SORT | PROC TRANSPOSE | PROC EXPORT | PROC CONTENTS IN SAS
/*Creating a dataset of the world’s most‑followed Facebook profiles*/
1. HOUSE‑KEEPING OPTIONS (good habits!)
options nocenter nodate nonumber
msglevel=i
pagesize=60 linesize=120
validvarname=v7
fmtsearch=(work);
2. GLOBAL MACRO VARIABLES & SYSTEM FUNCTIONS
%let snapshot_dttm = %sysfunc(today(), date9.);
%let delimiter = |;
3. USER‑DEFINED FORMATS (PROC FORMAT)
proc format;
value $continent
'United States','Canada' = 'North America'
'Portugal','United Kingdom',
'Spain','France','Europe' = 'Europe'
'Brazil','Barbados' = 'South America'
'China','Hong Kong' = 'Asia'
'South Korea','Cyprus' = 'Asia'
'Argentina' = 'South America'
other = 'Other';
value followers
0 - <100 = 'Under 100 M'
100 - <125 = '100‑124 M'
125 - <150 = '125‑149 M'
150 - <175 = '150‑174 M'
175 - high = '175 M and above';
run;
4. CORE DATA: 20 ROWS, 6 COLUMNS
data work.fb_followers(label="Top‑20 Facebook profiles (most followers)");
length Page_Name $38 Category $20 Country $20 ;
infile datalines dsd dlm="&delimiter";
input Rank
Page_Name :$38.
Followers_Millions
Category :$20.
Country :$20.;
/*— Derived columns —*/
Followers_Count = Followers_Millions * 1e6; /* raw number */
Continent = put(Country, $continent.); /* via FORMAT */
Follower_Band = put(Followers_Millions, followers.);
Snapshot_Date = "&snapshot_dttm"d; /* DATE value */
/* label + format housekeeping */
label Followers_Millions = 'Followers (Millions)'
Followers_Count = 'Followers (Exact)'
Continent = 'Continent (derived)';
format Followers_Millions 6. Followers_Count comma12. Continent $continent.
Snapshot_Date date9.;
datalines;
1 | Facebook | 188 | Platform | United States
2 | Cristiano Ronaldo | 170 | Footballer | Portugal
3 | Samsung | 162 | Brand | South Korea
4 | Mr. Bean | 140 | Character | United Kingdom
5 | 5-Minute Crafts | 126 | Media | Cyprus
6 | Shakira | 123 | Musician | Colombia
7 | CGTN | 121 | State Media | China
8 | Real Madrid C.F. | 121 | Football Club | Spain
9 | Lionel Messi | 117 | Footballer | Argentina
10| Will Smith | 115 | Actor | United States
11| FC Barcelona | 115 | Football Club | Spain
12| China Daily | 111 | State Media | China
13| Coca-Cola | 109 | Brand | United States
14| YouTube | 108 | Platform | United States
15| Vin Diesel | 105 | Actor | United States
16| Tasty | 104 | Media | United States
17| Rihanna | 104 | Musician | Barbados
18| Xinhua News Agency | 97 | State Media | China
19| Eminem | 94 | Musician | United States
20| Netflix | 94 | Streaming‑Service | United States
;
run;
proc print;run;
Output:
Obs | Page_Name | Category | Country | Rank | Followers_Millions | Followers_Count | Continent | Follower_Band | Snapshot_Date |
---|---|---|---|---|---|---|---|---|---|
1 | Platform | United States | 1 | 188 | 188,000,000 | Other | 175 M and above | 14SEP2015 | |
2 | Cristiano Ronaldo | Footballer | Portugal | 2 | 170 | 170,000,000 | Europe | 150-174 M | 14SEP2015 |
3 | Samsung | Brand | South Korea | 3 | 162 | 162,000,000 | Other | 150-174 M | 14SEP2015 |
4 | Mr. Bean | Character | United Kingdom | 4 | 140 | 140,000,000 | Europe | 125-149 M | 14SEP2015 |
5 | 5-Minute Crafts | Media | Cyprus | 5 | 126 | 126,000,000 | Other | 125-149 M | 14SEP2015 |
6 | Shakira | Musician | Colombia | 6 | 123 | 123,000,000 | Other | 100-124 M | 14SEP2015 |
7 | CGTN | State Media | China | 7 | 121 | 121,000,000 | Other | 100-124 M | 14SEP2015 |
8 | Real Madrid C.F. | Football Club | Spain | 8 | 121 | 121,000,000 | Europe | 100-124 M | 14SEP2015 |
9 | Lionel Messi | Footballer | Argentina | 9 | 117 | 117,000,000 | Other | 100-124 M | 14SEP2015 |
10 | Will Smith | Actor | United States | 10 | 115 | 115,000,000 | Other | 100-124 M | 14SEP2015 |
11 | FC Barcelona | Football Club | Spain | 11 | 115 | 115,000,000 | Europe | 100-124 M | 14SEP2015 |
12 | China Daily | State Media | China | 12 | 111 | 111,000,000 | Other | 100-124 M | 14SEP2015 |
13 | Coca-Cola | Brand | United States | 13 | 109 | 109,000,000 | Other | 100-124 M | 14SEP2015 |
14 | YouTube | Platform | United States | 14 | 108 | 108,000,000 | Other | 100-124 M | 14SEP2015 |
15 | Vin Diesel | Actor | United States | 15 | 105 | 105,000,000 | Other | 100-124 M | 14SEP2015 |
16 | Tasty | Media | United States | 16 | 104 | 104,000,000 | Other | 100-124 M | 14SEP2015 |
17 | Rihanna | Musician | Barbados | 17 | 104 | 104,000,000 | Other | 100-124 M | 14SEP2015 |
18 | Xinhua News Agency | State Media | China | 18 | 97 | 97,000,000 | Other | Under 100 M | 14SEP2015 |
19 | Eminem | Musician | United States | 19 | 94 | 94,000,000 | Other | Under 100 M | 14SEP2015 |
20 | Netflix | Streaming-Service | United States | 20 | 94 | 94,000,000 | Other | Under 100 M | 14SEP2015 |
5. QUICK PEEK (PROC PRINT) – sanity check first 10 rows
title "QC PRINT: First 10 of &SQLOBS rows (snapshot &snapshot_dttm)";
proc print data=work.fb_followers(obs=10);
var Rank Page_Name Followers_Millions Country Continent;
run;
title;
Output:
QC PRINT: First 10 of &SQLOBS rows (snapshot 14SEP2015) |
Obs | Rank | Page_Name | Followers_Millions | Country | Continent |
---|---|---|---|---|---|
1 | 1 | 188 | United States | Other | |
2 | 2 | Cristiano Ronaldo | 170 | Portugal | Europe |
3 | 3 | Samsung | 162 | South Korea | Other |
4 | 4 | Mr. Bean | 140 | United Kingdom | Europe |
5 | 5 | 5-Minute Crafts | 126 | Cyprus | Other |
6 | 6 | Shakira | 123 | Colombia | Other |
7 | 7 | CGTN | 121 | China | Other |
8 | 8 | Real Madrid C.F. | 121 | Spain | Europe |
9 | 9 | Lionel Messi | 117 | Argentina | Other |
10 | 10 | Will Smith | 115 | United States | Other |
6. SUMMARY STATISTICS (PROC MEANS) – follower distribution by band
title 'Descriptive Stats: Follower Counts by Size Band';
proc means data=work.fb_followers n min p25 median mean p75 max maxdec=1;
class Follower_Band;
var Followers_Millions;
run;
title;
Output:
Descriptive Stats: Follower Counts by Size Band |
Analysis Variable : Followers_Millions Followers (Millions) | ||||||||
---|---|---|---|---|---|---|---|---|
Follower_Band | N Obs | N | Minimum | 25th Pctl | Median | Mean | 75th Pctl | Maximum |
100-124 M | 12 | 12 | 104.0 | 106.5 | 113.0 | 112.8 | 119.0 | 123.0 |
125-149 M | 2 | 2 | 126.0 | 126.0 | 133.0 | 133.0 | 140.0 | 140.0 |
150-174 M | 2 | 2 | 162.0 | 162.0 | 166.0 | 166.0 | 170.0 | 170.0 |
175 M and above | 1 | 1 | 188.0 | 188.0 | 188.0 | 188.0 | 188.0 | 188.0 |
Under 100 M | 3 | 3 | 94.0 | 94.0 | 94.0 | 95.0 | 97.0 | 97.0 |
7. FREQUENCY ANALYSIS (PROC FREQ) – categorical splits
title 'Profile Counts by Continent vs. Category';
proc freq data=work.fb_followers order=freq;
tables Continent*Category / nocol nopercent norow;
run;
title;
Output:
Profile Counts by Continent vs. Category |
|
|
8. MACRO #1 : %TOPN – extract an arbitrary “N most followed” list
%macro topn(n=5,out=topn);
%put NOTE‑PGM: Generating table &out with top &n profiles.;
proc sql noprint outobs=&n;;
create table work.&out as
select *
from work.fb_followers
order by Followers_Millions desc;
quit;
proc print;run;
%mend topn;
%topn(n=7,out=top7);
Output:
Obs | Page_Name | Category | Country | Rank | Followers_Millions | Followers_Count | Continent | Follower_Band | Snapshot_Date |
---|---|---|---|---|---|---|---|---|---|
1 | Platform | United States | 1 | 188 | 188,000,000 | Other | 175 M and above | 14SEP2015 | |
2 | Cristiano Ronaldo | Footballer | Portugal | 2 | 170 | 170,000,000 | Europe | 150-174 M | 14SEP2015 |
3 | Samsung | Brand | South Korea | 3 | 162 | 162,000,000 | Other | 150-174 M | 14SEP2015 |
4 | Mr. Bean | Character | United Kingdom | 4 | 140 | 140,000,000 | Europe | 125-149 M | 14SEP2015 |
5 | 5-Minute Crafts | Media | Cyprus | 5 | 126 | 126,000,000 | Other | 125-149 M | 14SEP2015 |
6 | Shakira | Musician | Colombia | 6 | 123 | 123,000,000 | Other | 100-124 M | 14SEP2015 |
7 | CGTN | State Media | China | 7 | 121 | 121,000,000 | Other | 100-124 M | 14SEP2015 |
9. MACRO #2 : %EXPORTCSV – quick CSV dump with current date‑stamp
%macro exportcsv(data=,dir=%sysfunc(pathname(work)));
%local yy mm dd outfile;
%let yy=%sysfunc(year(%sysfunc(today())));
%let mm=%sysfunc(putn(%sysfunc(month(%sysfunc(today()))),z2.));
%let dd=%sysfunc(putn(%sysfunc(day(%sysfunc(today()))),z2.));
%let outfile=&dir./&data._&yy.&mm.&dd..csv;
proc export data=&data
outfile="&outfile."
dbms=csv replace;
run;
%put NOTE‑PGM: &data exported to &outfile ;
%mend exportcsv;
%exportcsv(data=top7)
Log:
10. PROC SQL INSIGHT – average followers by content‑type
title 'Average Followers by High‑Level Category';
proc sql;
select Category
, count(*) as Profiles
, avg(Followers_Millions) format=8.1 as Mean_Followers_M
from work.fb_followers
group by Category
order by Mean_Followers_M desc;
quit;
title;
Output:
Average Followers by High-Level Category |
Category | Profiles | Mean_Followers_M |
---|---|---|
Platform | 2 | 148.0 |
Footballer | 2 | 143.5 |
Character | 1 | 140.0 |
Brand | 2 | 135.5 |
Football Club | 2 | 118.0 |
Media | 2 | 115.0 |
Actor | 2 | 110.0 |
State Media | 3 | 109.7 |
Musician | 3 | 107.0 |
Streaming-Service | 1 | 94.0 |
11. REPORTING LAYER (PROC REPORT)
title "Top 20 Facebook Profiles – Ordered by Followers (Millions)";
proc report data=work.fb_followers nowd split='*';
column Rank Page_Name Followers_Millions Category Country;
define Rank / order 'Rank';
define Page_Name / display 'Page / Person';
define Followers_Millions/ analysis sum 'Followers (M)' format=6.0;
define Category / display 'Type';
define Country / display;
compute after;
line 'Snapshot date: &snapshot_dttm';
endcomp;
run;
title;
Output:
Top 20 Facebook Profiles – Ordered by Followers (Millions) |
Rank | Page / Person | Followers (M) | Type | Country |
---|---|---|---|---|
1 | 188 | Platform | United States | |
2 | Cristiano Ronaldo | 170 | Footballer | Portugal |
3 | Samsung | 162 | Brand | South Korea |
4 | Mr. Bean | 140 | Character | United Kingdom |
5 | 5-Minute Crafts | 126 | Media | Cyprus |
6 | Shakira | 123 | Musician | Colombia |
7 | CGTN | 121 | State Media | China |
8 | Real Madrid C.F. | 121 | Football Club | Spain |
9 | Lionel Messi | 117 | Footballer | Argentina |
10 | Will Smith | 115 | Actor | United States |
11 | FC Barcelona | 115 | Football Club | Spain |
12 | China Daily | 111 | State Media | China |
13 | Coca-Cola | 109 | Brand | United States |
14 | YouTube | 108 | Platform | United States |
15 | Vin Diesel | 105 | Actor | United States |
16 | Tasty | 104 | Media | United States |
17 | Rihanna | 104 | Musician | Barbados |
18 | Xinhua News Agency | 97 | State Media | China |
19 | Eminem | 94 | Musician | United States |
20 | Netflix | 94 | Streaming-Service | United States |
Snapshot date: &snapshot_dttm |
12. OPTIONAL: TRANSPOSE small subset for wide‑format presentation
proc transpose data=top7 out=wide_top7 prefix=Rank_;
id Rank;
var Followers_Millions;
run;
proc print;run;
Output:
Obs | _NAME_ | _LABEL_ | Rank_1 | Rank_2 | Rank_3 | Rank_4 | Rank_5 | Rank_6 | Rank_7 |
---|---|---|---|---|---|---|---|---|---|
1 | Followers_Millions | Followers (Millions) | 188 | 170 | 162 | 140 | 126 | 123 | 121 |
13. SORT + CONTENTS – classic housekeeping
proc sort data=work.fb_followers out=fb_sorted;
by descending Followers_Millions;
run;
proc contents data=fb_sorted varnum short; run;
Output:
Variables in Creation Order |
---|
Page_Name Category Country Rank Followers_Millions Followers_Count Continent Follower_Band Snapshot_Date |
SortedBy |
---|
DESCENDING Followers_Millions |
- Get link
- X
- Other Apps
Comments
Post a Comment