Sunday, 26 October 2025

293.REALWORLD PLANETS DATASET USING PROC PRINT | PROC MEANS | PROC FREQ | PROC SQL | PROC SORT | PROC REPORT | MACROS WITH QC CHECKS AND DENSITY DERIVATION

REALWORLD PLANETS DATASET USING PROC PRINT | PROC MEANS | PROC FREQ | PROC SQL | PROC SORT | PROC REPORT | MACROS WITH QC CHECKS AND DENSITY DERIVATION

/*CREATING A DATASET OF REALWORLD PLANETS  */

1) Create dataset 

options nocenter;

data work.real_planets;

    length PlanetID $3 Name $20 Type $12 Mass_Earth 8 Radius_km 8 OrbitalPeriod_days 8 

          Moons 8 Has_Rings $1;

    label PlanetID = 'Planet ID'

          Name = 'Planet Name'

          Type = 'Planet Type (Terrestrial/Gas/Ice/Dwarf)'

          Mass_Earth = 'Mass (in Earth masses)'

          Radius_km = 'Radius (km)'

          OrbitalPeriod_days = 'Orbital Period (days)'

          Moons = 'Number of known moons'

          Has_Rings = 'Has Rings (Y/N)';

    input PlanetID $ Name :$20. Type :$12. Mass_Earth Radius_km OrbitalPeriod_days

          Moons Has_Rings :$1.;

    datalines;

P01 Mercury Terrestrial 0.055 2439 88 0 N

P02 Venus Terrestrial 0.815 6052 225 0 N

P03 Earth Terrestrial 1 6371 365.25 1 N

P04 Mars Terrestrial 0.107 3390 687 2 N

P05 Jupiter Gas 317.8 69911 4332 79 Y

P06 Saturn Gas 95.2 58232 10759 83 Y

P07 Uranus Ice 14.5 25362 30685 27 Y

P08 Neptune Ice 17.1 24622 60190 14 Y

P09 Pluto Dwarf 0.0022 1188 90560 5 N

P10 Eris Dwarf 0.0028 1163 203830 1 N

P11 Haumea Dwarf 0.0007 816 103000 2 N

P12 Ceres Dwarf 0.00015 473 1680 0 N

;

run;

proc print;run;

Output:

ObsPlanetIDNameTypeMass_EarthRadius_kmOrbitalPeriod_daysMoonsHas_Rings
1P01MercuryTerrestrial0.055243988.000N
2P02VenusTerrestrial0.8156052225.000N
3P03EarthTerrestrial1.0006371365.251N
4P04MarsTerrestrial0.1073390687.002N
5P05JupiterGas317.800699114332.0079Y
6P06SaturnGas95.2005823210759.0083Y
7P07UranusIce14.5002536230685.0027Y
8P08NeptuneIce17.1002462260190.0014Y
9P09PlutoDwarf0.002118890560.005N
10P10ErisDwarf0.0031163203830.001N
11P11HaumeaDwarf0.001816103000.002N
12P12CeresDwarf0.0004731680.000N


2) Add a derived variable using another macro: density (mass/volume)

Macro: %derive_density(in=, out=)

Purpose: Demonstrates macro parameters and numeric derivation in DATA step.

Note: Mass given in Earth masses; 

     convert Earth mass (5.972e24 kg) and 

     radius in km to compute density g/cm3 roughly.

     This is illustrative and not exact to high precision.


%macro derive_density(in=work.real_planets, out=work.real_planets2);

  data &out.;

    set &in.;

    /* Convert mass in Earth masses to kg and radius in km to meters for volume */

    Mass_kg = Mass_Earth * 5.972e24; /* mass in kg */

    Radius_m = Radius_km * 1000; /* radius in meters */

    Volume_m3 = (4/3) * constant('pi') * (Radius_m**3);

/* Density in kg/m^3, convert to g/cm^3 dividing by 1000 then by 1e6 -> kg/m3 to g/cm3 = /1000*/

    Density_gcm3 = (Mass_kg / Volume_m3) / 1000;

    format Density_gcm3 6.3 Mass_kg comma15.0 Radius_m comma12.0 Volume_m3 comma15.0;

    drop Mass_kg Radius_m Volume_m3;

  run;

  proc print;run;

%mend derive_density;


%derive_density(in=work.real_planets, out=work.real_planets2);

Output:

ObsPlanetIDNameTypeMass_EarthRadius_kmOrbitalPeriod_daysMoonsHas_RingsDensity_gcm3
1P01MercuryTerrestrial0.055243988.000N5.405
2P02VenusTerrestrial0.8156052225.000N5.242
3P03EarthTerrestrial1.0006371365.251N5.513
4P04MarsTerrestrial0.1073390687.002N3.916
5P05JupiterGas317.800699114332.0079Y1.326
6P06SaturnGas95.2005823210759.0083Y0.687
7P07UranusIce14.5002536230685.0027Y1.267
8P08NeptuneIce17.1002462260190.0014Y1.633
9P09PlutoDwarf0.002118890560.005N1.871
10P10ErisDwarf0.0031163203830.001N2.538
11P11HaumeaDwarf0.001816103000.002N1.837
12P12CeresDwarf0.0004731680.000N2.021


3) PROC PRINT 

Purpose: Display the dataset rows for quick inspection.

proc print data=work.real_planets2 label noobs;

  title 'PROC PRINT | Purpose: Display dataset rows for quick inspection';

  var PlanetID Name Type Mass_Earth Radius_km OrbitalPeriod_days Moons Has_Rings Density_gcm3;

run;

Output:

PROC PRINT | Purpose: Display dataset rows for quick inspection

Planet IDPlanet NamePlanet Type (Terrestrial/Gas/Ice/Dwarf)Mass (in Earth masses)Radius (km)Orbital Period (days)Number of known moonsHas Rings (Y/N)Density_gcm3
P01MercuryTerrestrial0.055243988.000N5.405
P02VenusTerrestrial0.8156052225.000N5.242
P03EarthTerrestrial1.0006371365.251N5.513
P04MarsTerrestrial0.1073390687.002N3.916
P05JupiterGas317.800699114332.0079Y1.326
P06SaturnGas95.2005823210759.0083Y0.687
P07UranusIce14.5002536230685.0027Y1.267
P08NeptuneIce17.1002462260190.0014Y1.633
P09PlutoDwarf0.002118890560.005N1.871
P10ErisDwarf0.0031163203830.001N2.538
P11HaumeaDwarf0.001816103000.002N1.837
P12CeresDwarf0.0004731680.000N2.021

4) PROC MEANS 

Purpose: Provide numeric summaries (mean, min, max, std) for continuous variables.

proc means data=work.real_planets2 mean median min max std n nmiss;

  title 'PROC MEANS | Purpose: Provide numeric summaries for continuous variables';

  var Mass_Earth Radius_km OrbitalPeriod_days Moons Density_gcm3;

run;

Output:

PROC MEANS | Purpose: Provide numeric summaries for continuous variables

The MEANS Procedure

VariableLabelMeanMedianMinimumMaximumStd DevNN Miss
Mass_Earth
Radius_km
OrbitalPeriod_days
Moons
Density_gcm3
Mass (in Earth masses)
Radius (km)
Orbital Period (days)
Number of known moons
 
37.2152375
16668.25
42200.10
17.8333333
2.7712886
0.4610000
4721.00
7545.50
2.0000000
1.9457874
0.000150000
473.0000000
88.0000000
0
0.6873580
317.8000000
69911.00
203830.00
83.0000000
5.5132587
92.4143495
23917.15
63021.65
30.5609178
1.7621593
12
12
12
12
12
0
0
0
0
0

5) PROC FREQ 

Purpose: Frequency distribution for categorical variables like Type and Has_Rings.

proc freq data=work.real_planets2;

  title 'PROC FREQ | Purpose: Frequency distribution for categorical variables';

  tables Type Has_Rings / norow nocol nopercent;

run;

Output:

PROC FREQ | Purpose: Frequency distribution for categorical variables

The FREQ Procedure

Planet Type (Terrestrial/Gas/Ice/Dwarf)
TypeFrequencyCumulative
Frequency
Dwarf44
Gas26
Ice28
Terrestrial412
Has Rings (Y/N)
Has_RingsFrequencyCumulative
Frequency
N88
Y412

6) PROC SQL 

Purpose: Use SQL to create aggregated summary tables and perform joins.

proc sql;

  title 'PROC SQL | Purpose: Create aggregated tables using SQL syntax';

  create table work.by_type as

  select Type,

         count(*) as Count_Planets,

         mean(Mass_Earth) as Avg_Mass_Earth format=8.3,

         mean(Density_gcm3) as Avg_Density format=6.3,

         sum(Moons) as Total_Moons

  from work.real_planets2

  group by Type

  order by Count_Planets desc;

quit;


proc print data=work.by_type noobs;

  title 'Summary by Type (created by PROC SQL)';

run;

Output:

Summary by Type (created by PROC SQL)

TypeCount_PlanetsAvg_Mass_EarthAvg_DensityTotal_Moons
Dwarf40.0012.0678
Terrestrial40.4945.0193
Ice215.8001.45041
Gas2206.5001.007162

7) PROC SORT 

Purpose: Sort the dataset for reporting or analysis.

proc sort data=work.real_planets2 out=work.real_planets_sorted;

  by descending Mass_Earth;

run;

proc print;run;

Output:

ObsPlanetIDNameTypeMass_EarthRadius_kmOrbitalPeriod_daysMoonsHas_RingsDensity_gcm3
1P05JupiterGas317.800699114332.0079Y1.326
2P06SaturnGas95.2005823210759.0083Y0.687
3P08NeptuneIce17.1002462260190.0014Y1.633
4P07UranusIce14.5002536230685.0027Y1.267
5P03EarthTerrestrial1.0006371365.251N5.513
6P02VenusTerrestrial0.8156052225.000N5.242
7P04MarsTerrestrial0.1073390687.002N3.916
8P01MercuryTerrestrial0.055243988.000N5.405
9P10ErisDwarf0.0031163203830.001N2.538
10P09PlutoDwarf0.002118890560.005N1.871
11P11HaumeaDwarf0.001816103000.002N1.837
12P12CeresDwarf0.0004731680.000N2.021


8) PROC REPORT 

Purpose: Create a formatted report combining text and numeric columns.

proc report data=work.real_planets_sorted nowd;

  title 'PROC REPORT | Purpose: Create a formatted report combining text and numeric columns';

  columns PlanetID Name Type Mass_Earth Radius_km Density_gcm3 Moons Has_Rings;

  define PlanetID / display 'ID' width=3;

  define Name / display 'Planet' width=15;

  define Type / display 'Type' width=10;

  define Mass_Earth / analysis mean format=8.3 'Mass (Earths)';

  define Radius_km / display format=comma8.0 'Radius (km)';

  define Density_gcm3 / display format=6.3 'Density (g/cm3)';

  define Moons / analysis sum 'Moons';

  define Has_Rings / display 'Rings';

run;

Output:

PROC REPORT | Purpose: Create a formatted report combining text and numeric columns

IDPlanetTypeMass (Earths)Radius (km)Density (g
cm3)
MoonsRings
P05JupiterGas317.80069,9111.32679Y
P06SaturnGas95.20058,2320.68783Y
P08NeptuneIce17.10024,6221.63314Y
P07UranusIce14.50025,3621.26727Y
P03EarthTerrestrial1.0006,3715.5131N
P02VenusTerrestrial0.8156,0525.2420N
P04MarsTerrestrial0.1073,3903.9162N
P01MercuryTerrestrial0.0552,4395.4050N
P10ErisDwarf0.0031,1632.5381N
P09PlutoDwarf0.0021,1881.8715N
P11HaumeaDwarf0.0018161.8372N
P12CeresDwarf0.0004732.0210N

9) Example macro to create a radar of which planets have rings

Macro: %planets_with_rings 

Purpose: Generate a small table of ringed planets

%macro planets_with_rings(in=work.real_planets2, out=work.ringed);

  proc sql;

    create table &out. as

    select PlanetID, Name, Type, Mass_Earth, Moons

    from &in.

    where upcase(Has_Rings) = 'Y'

    order by Mass_Earth desc;

  quit;

%mend planets_with_rings;


%planets_with_rings(in=work.real_planets2, out=work.ringed);


proc print data=work.ringed noobs;

  title 'Planets with Rings (macro-generated)';

run;

Output:

Planets with Rings (macro-generated)

PlanetIDNameTypeMass_EarthMoons
P05JupiterGas317.879
P06SaturnGas95.283
P08NeptuneIce17.114
P07UranusIce14.527

10) Validation / QC checks using PROC SQL and data step

Purpose: Identify missing values, outliers, and logical inconsistencies.

proc sql;

  create table work.qc_checks as

  select PlanetID, Name,

         case when Mass_Earth <= 0 then 'Mass<=0' else '' end as Mass_Issue,

         case when Radius_km <= 0 then 'Radius<=0' else '' end as Radius_Issue,

         case when Missing(Type) then 'TypeMissing' else '' end as Type_Issue

  from work.real_planets2;

quit;


proc print data=work.qc_checks noobs;

  title 'QC Checks - Purpose: Identify missing or invalid numeric/categorical values';

run;

Output:

QC Checks - Purpose: Identify missing or invalid numeric/categorical values

PlanetIDNameMass_IssueRadius_IssueType_Issue
P01Mercury   
P02Venus   
P03Earth   
P04Mars   
P05Jupiter   
P06Saturn   
P07Uranus   
P08Neptune   
P09Pluto   
P10Eris   
P11Haumea   
P12Ceres   




To Visit My Previous Online Streaming Flatform Dataset:Click Here
To Visit My Previous Statistical Evaluation Of Clinical Trials:Click Here
To Visit My Previous Unlocking Retail Insights Dataset:Click Here
To Visit My Previous Sas Interview Questions-1:Click Here



Follow Us On : 


 


--- FOLLOW OUR BLOG FOR MORE INFORMATION.

--->PLEASE DO COMMENTS AND SHARE OUR BLOG.

No comments:

Post a Comment