372.CHEMICAL ELEMENTS ANALYSIS USING SAS


THIS PROJECT USES THE FOLLOWING SAS FEATURES: DATA STEP | PROC SQL | PROC MEANS | PROC UNIVARIATE | MACROS | DATE FUNCTIONS (MDY, INTCK, INTNX) | MERGE | SET | APPEND | TRANSPOSE

1. Project Overview

Chemical elements form the foundation of chemistry, materials science, pharmaceuticals, metallurgy, and electronics. Each element has unique physical and chemical properties such as atomic number, atomic weight, melting point, and industrial or medical usage.

In this project, we create a custom chemical elements dataset and analyze it using SAS programming techniques, focusing on:

·       Data creation and structuring

·       Date handling and time intelligence

·       Statistical analysis

·       Classification using macros

·       Dataset transformation and reshaping

·       Real-world reporting readiness

This project is intentionally designed to simulate enterprise-level SAS work, not just academic examples.


2. Dataset Design

2.1 Variables Included

Variable Name | Description
Element_Name | Name of the chemical element
Atomic_Number | Number of protons
Weight | Atomic weight
Melting_Point | Melting point in °C
Usage | Primary industrial / scientific usage
Discovery_Date | Date the element was discovered


3. Creating the Base Dataset (DATA Step)

data elements_raw;
    /* Declare character lengths before INPUT so long values are not cut at 8 characters */
    length Element_Name $15 Usage $40;
    format Discovery_Date date9.;
    /* The colon modifier applies the DATE9. informat while still using list input */
    input Element_Name $ Atomic_Number Weight Melting_Point Usage $
          Discovery_Date :date9.;
datalines;
Hydrogen 1 1.008 -259 Fuel 01JAN1766
Helium 2 4.0026 -272 Cryogenics 01JAN1895
Lithium 3 6.94 180 Batteries 01JAN1817
Carbon 6 12.011 3550 OrganicChem 01JAN0000
Nitrogen 7 14.007 -210 Fertilizers 01JAN1772
Oxygen 8 15.999 -218 MedicalUse 01JAN1774
Sodium 11 22.99 98 SaltProduction 01JAN1807
Magnesium 12 24.305 650 Alloys 01JAN1755
Aluminum 13 26.98 660 Construction 01JAN1825
Silicon 14 28.085 1414 Electronics 01JAN1824
Phosphorus 15 30.974 44 Agriculture 01JAN1669
Sulfur 16 32.06 115 Chemicals 01JAN1777
Chlorine 17 35.45 -101 WaterTreatment 01JAN1774
Iron 26 55.845 1538 Steel 01JAN0000
Copper 29 63.546 1085 Wiring 01JAN0000
Zinc 30 65.38 420 Galvanization 01JAN1746
Silver 47 107.8682 962 Jewelry 01JAN0000
Gold 79 196.97 1064 Investment 01JAN0000
;
run;

proc print data=elements_raw;
run;

OUTPUT:

Obs | Element_Name | Usage | Discovery_Date | Atomic_Number | Weight | Melting_Point
1 | Hydrogen | Fuel | 01JAN1766 | 1 | 1.008 | -259
2 | Helium | Cryogenics | 01JAN1895 | 2 | 4.003 | -272
3 | Lithium | Batteries | 01JAN1817 | 3 | 6.940 | 180
4 | Carbon | OrganicChem | 01JAN2000 | 6 | 12.011 | 3550
5 | Nitrogen | Fertilizers | 01JAN1772 | 7 | 14.007 | -210
6 | Oxygen | MedicalUse | 01JAN1774 | 8 | 15.999 | -218
7 | Sodium | SaltProduction | 01JAN1807 | 11 | 22.990 | 98
8 | Magnesium | Alloys | 01JAN1755 | 12 | 24.305 | 650
9 | Aluminum | Construction | 01JAN1825 | 13 | 26.980 | 660
10 | Silicon | Electronics | 01JAN1824 | 14 | 28.085 | 1414
11 | Phosphorus | Agriculture | 01JAN1669 | 15 | 30.974 | 44
12 | Sulfur | Chemicals | 01JAN1777 | 16 | 32.060 | 115
13 | Chlorine | WaterTreatment | 01JAN1774 | 17 | 35.450 | -101
14 | Iron | Steel | 01JAN2000 | 26 | 55.845 | 1538
15 | Copper | Wiring | 01JAN2000 | 29 | 63.546 | 1085
16 | Zinc | Galvanization | 01JAN1746 | 30 | 65.380 | 420
17 | Silver | Jewelry | 01JAN2000 | 47 | 107.868 | 962
18 | Gold | Investment | 01JAN2000 | 79 | 196.970 | 1064

Key Concepts Learned

·       Variable length control

·       Informats and formats

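·       Date value handling: the 01JAN0000 placeholder entered for Carbon, Iron, Copper, Silver, and Gold is displayed as 01JAN2000 in the output, so it is not a real discovery date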

·       Realistic domain data modeling
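
Elements whose Discovery_Date prints as 01JAN2000 carry a placeholder rather than a real date. A minimal sketch of one way to keep that placeholder out of later date arithmetic is shown here. It assumes 01JAN2000 never occurs as a genuine discovery date in this table, and the names elements_clean and Known_Since_Antiquity are hypothetical; they are not used elsewhere in the project.

```sas
/* Sketch only: elements_clean and Known_Since_Antiquity are hypothetical names */
data elements_clean;
    set elements_raw;

    /* '01JAN2000'd is a SAS date literal; the comparison yields 1 (true) or 0 (false) */
    /* Assumption: 01JAN2000 appears only as the stored form of the placeholder */
    Known_Since_Antiquity = (Discovery_Date = '01JAN2000'd);

    /* Replace the placeholder with a missing date so it cannot be mistaken for real data */
    if Known_Since_Antiquity then Discovery_Date = .;
run;

proc print data=elements_clean;
    var Element_Name Discovery_Date Known_Since_Antiquity;
run;
```

With Discovery_Date set to missing, date functions such as INTCK and INTNX return missing for these rows instead of a value derived from the placeholder.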


4. Date Intelligence and Derived Variables

4.1 Adding Study Dates Using MDY, INTNX, INTCK

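data elements_dates;
    set elements_raw;

    /* Fixed reference date for the study: 01JAN2025 */
    Reference_Date = mdy(1,1,2025);
    /* Number of 1 January boundaries between discovery and the reference date */
    Years_Since_Discovery = intck('year', Discovery_Date, Reference_Date);
    /* Discovery date shifted forward 300 years, keeping the same day and month */
    Next_Review_Date = intnx('year', Discovery_Date, 300, 'same');

    format Reference_Date Next_Review_Date date9.;
run;

proc print data=elements_dates;
run;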

OUTPUT:

Obs | Element_Name | Usage | Discovery_Date | Atomic_Number | Weight | Melting_Point | Reference_Date | Years_Since_Discovery | Next_Review_Date
1 | Hydrogen | Fuel | 01JAN1766 | 1 | 1.008 | -259 | 01JAN2025 | 259 | 01JAN2066
2 | Helium | Cryogenics | 01JAN1895 | 2 | 4.003 | -272 | 01JAN2025 | 130 | 01JAN2195
3 | Lithium | Batteries | 01JAN1817 | 3 | 6.940 | 180 | 01JAN2025 | 208 | 01JAN2117
4 | Carbon | OrganicChem | 01JAN2000 | 6 | 12.011 | 3550 | 01JAN2025 | 25 | 01JAN2300
5 | Nitrogen | Fertilizers | 01JAN1772 | 7 | 14.007 | -210 | 01JAN2025 | 253 | 01JAN2072
6 | Oxygen | MedicalUse | 01JAN1774 | 8 | 15.999 | -218 | 01JAN2025 | 251 | 01JAN2074
7 | Sodium | SaltProduction | 01JAN1807 | 11 | 22.990 | 98 | 01JAN2025 | 218 | 01JAN2107
8 | Magnesium | Alloys | 01JAN1755 | 12 | 24.305 | 650 | 01JAN2025 | 270 | 01JAN2055
9 | Aluminum | Construction | 01JAN1825 | 13 | 26.980 | 660 | 01JAN2025 | 200 | 01JAN2125
10 | Silicon | Electronics | 01JAN1824 | 14 | 28.085 | 1414 | 01JAN2025 | 201 | 01JAN2124
11 | Phosphorus | Agriculture | 01JAN1669 | 15 | 30.974 | 44 | 01JAN2025 | 356 | 01JAN1969
12 | Sulfur | Chemicals | 01JAN1777 | 16 | 32.060 | 115 | 01JAN2025 | 248 | 01JAN2077
13 | Chlorine | WaterTreatment | 01JAN1774 | 17 | 35.450 | -101 | 01JAN2025 | 251 | 01JAN2074
14 | Iron | Steel | 01JAN2000 | 26 | 55.845 | 1538 | 01JAN2025 | 25 | 01JAN2300
15 | Copper | Wiring | 01JAN2000 | 29 | 63.546 | 1085 | 01JAN2025 | 25 | 01JAN2300
16 | Zinc | Galvanization | 01JAN1746 | 30 | 65.380 | 420 | 01JAN2025 | 279 | 01JAN2046
17 | Silver | Jewelry | 01JAN2000 | 47 | 107.868 | 962 | 01JAN2025 | 25 | 01JAN2300
18 | Gold | Investment | 01JAN2000 | 79 | 196.970 | 1064 | 01JAN2025 | 25 | 01JAN2300

Explanation

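·       MDY(month, day, year) builds a SAS date value from numeric parts

·       INTCK() counts the interval boundaries (here, 1 January year boundaries) between two dates

·       INTNX() shifts a date by a given number of intervals; with 'same' alignment, 300 years are added to each discovery date

·       These functions come up often in SAS interviews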
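
Because INTCK counts boundaries rather than elapsed time by default, two dates one day apart can be "one year" apart. The short sketch below illustrates that, along with the effect of the INTNX alignment argument. The variable names are illustrative only; the 'c' (continuous) method of INTCK is available in SAS 9.2 and later.

```sas
/* Illustrative only: compares INTCK counting methods and INTNX alignments */
data _null_;
    d1 = mdy(12, 31, 2024);
    d2 = mdy(1, 1, 2025);

    /* Default (discrete) method: one 1 January boundary lies between d1 and d2 */
    years_discrete = intck('year', d1, d2);
    /* Continuous method: counts full years elapsed since d1 */
    years_continuous = intck('year', d1, d2, 'c');

    /* 'b' aligns to the beginning of the target year, 's' keeps the same day and month */
    next_begin = intnx('year', d1, 1, 'b');
    next_same  = intnx('year', d1, 1, 's');

    /* Results are written to the SAS log */
    put years_discrete= years_continuous=;
    put next_begin= date9. next_same= date9.;
run;
```

In this project every discovery date and the reference date fall on 1 January, so the discrete and continuous counts agree; with mid-year dates they can differ by one.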


5. PROC SQL – Data Querying & Business Logic

5.1 Creating a Filtered Table

proc sql;
    /* Keep only elements that melt above 1000 °C */
    create table high_temp_elements as
    select Element_Name, Atomic_Number, Melting_Point, Usage
    from elements_dates
    where Melting_Point > 1000;
quit;

proc print data=high_temp_elements;
run;

OUTPUT:

Obs | Element_Name | Atomic_Number | Melting_Point | Usage
1 | Carbon | 6 | 3550 | OrganicChem
2 | Silicon | 14 | 1414 | Electronics
3 | Iron | 26 | 1538 | Steel
4 | Copper | 29 | 1085 | Wiring
5 | Gold | 79 | 1064 | Investment

What This Demonstrates

·       SQL-style filtering

·       Dataset creation

·       Conditional logic

·       Industry relevance (high-temperature materials)
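
The same threshold can also drive a summary query instead of a filter. The sketch below, which is an illustration and not part of the original project, uses a CASE expression and GROUP BY to count elements on each side of the 1000 °C cutoff; the column names Melting_Band, N_Elements, and Avg_Weight are hypothetical.

```sas
/* Sketch: summarise elements_dates by the same 1000 °C threshold */
proc sql;
    select case
               when Melting_Point > 1000 then 'Above 1000'
               else '1000 or below'
           end as Melting_Band,
           count(*)    as N_Elements,
           avg(Weight) as Avg_Weight format=8.3
    from elements_dates
    /* CALCULATED refers to a column defined earlier in the same SELECT */
    group by calculated Melting_Band;
quit;
```

With the 18 rows used here, the 'Above 1000' band contains the five elements listed in high_temp_elements and the other band contains the remaining 13.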


6. Statistical Analysis

6.1 PROC MEANS

proc means data=elements_dates mean min max std;
    var Weight Melting_Point Years_Since_Discovery;
run;

OUTPUT:

The MEANS Procedure

Variable | Mean | Minimum | Maximum | Std Dev
Weight | 41.3567111 | 1.0080000 | 196.9700000 | 47.0411336
Melting_Point | 595.5555556 | -272.0000000 | 3550.00 | 944.0932296
Years_Since_Discovery | 180.5000000 | 25.0000000 | 356.0000000 | 108.6934437

Why PROC MEANS Matters

·       Central tendency

·       Variability analysis

·       Mandatory for clinical & industrial analytics
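
When the statistics are needed by a later step rather than only in a report, PROC MEANS can write them to a dataset. The following is a minimal sketch under that assumption; element_stats is a hypothetical dataset name.

```sas
/* Sketch: store the statistics in a dataset instead of printing them */
proc means data=elements_dates noprint;
    var Weight Melting_Point;
    /* AUTONAME builds output names such as Weight_Mean and Melting_Point_StdDev */
    output out=element_stats mean= std= / autoname;
run;

proc print data=element_stats;
run;
```

The output dataset has one row because no CLASS or BY variable is used, plus the automatic _TYPE_ and _FREQ_ columns.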


6.2 PROC UNIVARIATE

proc univariate data=elements_dates;
    var Melting_Point;
    histogram Melting_Point;
run;

OUTPUT:

The UNIVARIATE Procedure

Variable: Melting_Point

Moments
N                   18            Sum Weights         18
Mean                595.555556    Sum Observations    10720
Std Deviation       944.09323     Variance            891312.026
Skewness            1.91009777    Kurtosis            4.82690524
Uncorrected SS      21536660      Corrected SS        15152304.4
Coeff Variation     158.523117    Std Error Mean      222.524908

Basic Statistical Measures
Location                Variability
Mean      595.5556      Std Deviation          944.09323
Median    300.0000      Variance               891312
Mode      .             Range                  3822
                        Interquartile Range    1165

Tests for Location: Mu0=0
Test           Statistic        p Value
Student's t    t    2.676355    Pr > |t|     0.0159
Sign           M    4           Pr >= |M|    0.0963
Signed Rank    S    52.5        Pr >= |S|    0.0208

Quantiles (Definition 5)
Level         Quantile
100% Max      3550
99%           3550
95%           3550
90%           1538
75% Q3        1064
50% Median    300
25% Q1        -101
10%           -259
5%            -272
1%            -272
0% Min        -272

Extreme Observations
Lowest           Highest
Value    Obs     Value    Obs
-272     2       1064     18
-259     1       1085     15
-218     6       1414     10
-210     5       1538     14
-101     13      3550     4

The UNIVARIATE Procedure

Histogram for Melting_Point

What This Adds

·       Distribution understanding

·       Outlier detection

·       Regulatory-grade statistics
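
Outlier detection can be made explicit with the common 1.5 × IQR rule. The sketch below is one possible approach, not the article's own method: it writes the quartiles to a dataset and keeps the rows outside the fences. The names mp_quartiles and melting_outliers are hypothetical.

```sas
/* Sketch: flag melting points outside Q1 - 1.5*IQR and Q3 + 1.5*IQR */
proc univariate data=elements_dates noprint;
    var Melting_Point;
    output out=mp_quartiles q1=Q1 q3=Q3;
run;

data melting_outliers;
    /* Read the single quartile row once; its values are retained for every element */
    if _n_ = 1 then set mp_quartiles;
    set elements_dates;

    IQR = Q3 - Q1;
    /* Subsetting IF: keep only rows beyond either fence */
    if Melting_Point < Q1 - 1.5*IQR or Melting_Point > Q3 + 1.5*IQR;

    keep Element_Name Melting_Point Q1 Q3 IQR;
run;

proc print data=melting_outliers;
run;
```

With the quartiles reported above (Q1 = -101, Q3 = 1064, IQR = 1165), the upper fence is 2811.5 and the lower fence is -1848.5, so Carbon (3550) is the only element outside them.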


7. Macro Programming – Classification Logic

7.1 Macro to Classify Elements by Melting Point

%macro classify_element;
    /* Assign a melting-point category to every element */
    data classified_elements;
        set elements_dates;

        if Melting_Point < 0 then Category = "Gas or Cryogenic";
        else if Melting_Point < 500 then Category = "Low Melting";
        else if Melting_Point < 1000 then Category = "Medium Melting";
        else Category = "High Melting";
    run;

    proc print data=classified_elements;
    run;
%mend classify_element;

%classify_element;

OUTPUT:

Obs | Element_Name | Usage | Discovery_Date | Atomic_Number | Weight | Melting_Point | Reference_Date | Years_Since_Discovery | Next_Review_Date | Category
1 | Hydrogen | Fuel | 01JAN1766 | 1 | 1.008 | -259 | 01JAN2025 | 259 | 01JAN2066 | Gas or Cryogenic
2 | Helium | Cryogenics | 01JAN1895 | 2 | 4.003 | -272 | 01JAN2025 | 130 | 01JAN2195 | Gas or Cryogenic
3 | Lithium | Batteries | 01JAN1817 | 3 | 6.940 | 180 | 01JAN2025 | 208 | 01JAN2117 | Low Melting
4 | Carbon | OrganicChem | 01JAN2000 | 6 | 12.011 | 3550 | 01JAN2025 | 25 | 01JAN2300 | High Melting
5 | Nitrogen | Fertilizers | 01JAN1772 | 7 | 14.007 | -210 | 01JAN2025 | 253 | 01JAN2072 | Gas or Cryogenic
6 | Oxygen | MedicalUse | 01JAN1774 | 8 | 15.999 | -218 | 01JAN2025 | 251 | 01JAN2074 | Gas or Cryogenic
7 | Sodium | SaltProduction | 01JAN1807 | 11 | 22.990 | 98 | 01JAN2025 | 218 | 01JAN2107 | Low Melting
8 | Magnesium | Alloys | 01JAN1755 | 12 | 24.305 | 650 | 01JAN2025 | 270 | 01JAN2055 | Medium Melting
9 | Aluminum | Construction | 01JAN1825 | 13 | 26.980 | 660 | 01JAN2025 | 200 | 01JAN2125 | Medium Melting
10 | Silicon | Electronics | 01JAN1824 | 14 | 28.085 | 1414 | 01JAN2025 | 201 | 01JAN2124 | High Melting
11 | Phosphorus | Agriculture | 01JAN1669 | 15 | 30.974 | 44 | 01JAN2025 | 356 | 01JAN1969 | Low Melting
12 | Sulfur | Chemicals | 01JAN1777 | 16 | 32.060 | 115 | 01JAN2025 | 248 | 01JAN2077 | Low Melting
13 | Chlorine | WaterTreatment | 01JAN1774 | 17 | 35.450 | -101 | 01JAN2025 | 251 | 01JAN2074 | Gas or Cryogenic
14 | Iron | Steel | 01JAN2000 | 26 | 55.845 | 1538 | 01JAN2025 | 25 | 01JAN2300 | High Melting
15 | Copper | Wiring | 01JAN2000 | 29 | 63.546 | 1085 | 01JAN2025 | 25 | 01JAN2300 | High Melting
16 | Zinc | Galvanization | 01JAN1746 | 30 | 65.380 | 420 | 01JAN2025 | 279 | 01JAN2046 | Low Melting
17 | Silver | Jewelry | 01JAN2000 | 47 | 107.868 | 962 | 01JAN2025 | 25 | 01JAN2300 | Medium Melting
18 | Gold | Investment | 01JAN2000 | 79 | 196.970 | 1064 | 01JAN2025 | 25 | 01JAN2300 | High Melting

Why Macros Are Critical

·       Reusability

·       Automation

·       Interview favorite topic

·       Production efficiency
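
Reusability comes mainly from parameters. As a hedged illustration, the sketch below reworks the same classification so the input table, output table, and cutoffs are macro parameters. The macro name classify_by_cutoffs, its parameters, and the dataset classified_alt are hypothetical and not part of the original project.

```sas
/* Sketch: parameterised variant of the classification macro (hypothetical names) */
%macro classify_by_cutoffs(in=, out=, low=500, high=1000);
    data &out;
        set &in;
        length Category $16;

        /* A missing numeric value compares as less than 0 in SAS, so test for it first */
        if missing(Melting_Point) then Category = "Unknown";
        else if Melting_Point < 0 then Category = "Gas or Cryogenic";
        else if Melting_Point < &low then Category = "Low Melting";
        else if Melting_Point < &high then Category = "Medium Melting";
        else Category = "High Melting";
    run;
%mend classify_by_cutoffs;

/* The default cutoffs reproduce the categories assigned by the macro above */
%classify_by_cutoffs(in=elements_dates, out=classified_alt);

/* Different cutoffs need only a different call, not a code change */
%classify_by_cutoffs(in=elements_dates, out=classified_alt, low=300, high=1200);
```

The explicit LENGTH statement avoids relying on the first assigned string to set the width of Category.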


8. Dataset Combination Techniques

8.1 SET Statement

data combined_set;
    /* Stack the rows of both tables; variables missing from one table are left missing */
    set elements_raw
        high_temp_elements;
run;

proc print data=combined_set;
run;

OUTPUT:

Obs | Element_Name | Usage | Discovery_Date | Atomic_Number | Weight | Melting_Point
1 | Hydrogen | Fuel | 01JAN1766 | 1 | 1.008 | -259
2 | Helium | Cryogenics | 01JAN1895 | 2 | 4.003 | -272
3 | Lithium | Batteries | 01JAN1817 | 3 | 6.940 | 180
4 | Carbon | OrganicChem | 01JAN2000 | 6 | 12.011 | 3550
5 | Nitrogen | Fertilizers | 01JAN1772 | 7 | 14.007 | -210
6 | Oxygen | MedicalUse | 01JAN1774 | 8 | 15.999 | -218
7 | Sodium | SaltProduction | 01JAN1807 | 11 | 22.990 | 98
8 | Magnesium | Alloys | 01JAN1755 | 12 | 24.305 | 650
9 | Aluminum | Construction | 01JAN1825 | 13 | 26.980 | 660
10 | Silicon | Electronics | 01JAN1824 | 14 | 28.085 | 1414
11 | Phosphorus | Agriculture | 01JAN1669 | 15 | 30.974 | 44
12 | Sulfur | Chemicals | 01JAN1777 | 16 | 32.060 | 115
13 | Chlorine | WaterTreatment | 01JAN1774 | 17 | 35.450 | -101
14 | Iron | Steel | 01JAN2000 | 26 | 55.845 | 1538
15 | Copper | Wiring | 01JAN2000 | 29 | 63.546 | 1085
16 | Zinc | Galvanization | 01JAN1746 | 30 | 65.380 | 420
17 | Silver | Jewelry | 01JAN2000 | 47 | 107.868 | 962
18 | Gold | Investment | 01JAN2000 | 79 | 196.970 | 1064
19 | Carbon | OrganicChem | . | 6 | . | 3550
20 | Silicon | Electronics | . | 14 | . | 1414
21 | Iron | Steel | . | 26 | . | 1538
22 | Copper | Wiring | . | 29 | . | 1085
23 | Gold | Investment | . | 79 | . | 1064

Interview Insight

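The difference between SET (which stacks the rows of the input datasets) and MERGE (which combines their columns, usually matched on a BY variable) is a classic SAS interview question.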
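
A small side-by-side example makes the contrast concrete. The sketch below uses two tiny hypothetical tables, weights_demo and melting_demo, built only for this illustration; their values are taken from the elements dataset.

```sas
/* Two small demo tables, already in Element_Name order */
data weights_demo;
    input Element_Name $ Weight;
datalines;
Gold 196.97
Iron 55.845
;
run;

data melting_demo;
    input Element_Name $ Melting_Point;
datalines;
Gold 1064
Iron 1538
;
run;

/* SET stacks rows: 4 observations, each with one of the two measures missing */
data stacked_demo;
    set weights_demo melting_demo;
run;

/* MERGE with BY joins columns: 2 observations, each with both measures */
data joined_demo;
    merge weights_demo melting_demo;
    by Element_Name;
run;

proc print data=stacked_demo; run;
proc print data=joined_demo; run;
```

MERGE with a BY statement requires both inputs to be sorted (or indexed) by the BY variable, which is why section 8.3 sorts before merging.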


8.2 APPEND

/* Adds the rows of high_temp_elements to elements_raw in place */
proc append base=elements_raw
            data=high_temp_elements force;
run;

proc print data=elements_raw;
run;

OUTPUT:

Obs | Element_Name | Usage | Discovery_Date | Atomic_Number | Weight | Melting_Point
1 | Hydrogen | Fuel | 01JAN1766 | 1 | 1.008 | -259
2 | Helium | Cryogenics | 01JAN1895 | 2 | 4.003 | -272
3 | Lithium | Batteries | 01JAN1817 | 3 | 6.940 | 180
4 | Carbon | OrganicChem | 01JAN2000 | 6 | 12.011 | 3550
5 | Nitrogen | Fertilizers | 01JAN1772 | 7 | 14.007 | -210
6 | Oxygen | MedicalUse | 01JAN1774 | 8 | 15.999 | -218
7 | Sodium | SaltProduction | 01JAN1807 | 11 | 22.990 | 98
8 | Magnesium | Alloys | 01JAN1755 | 12 | 24.305 | 650
9 | Aluminum | Construction | 01JAN1825 | 13 | 26.980 | 660
10 | Silicon | Electronics | 01JAN1824 | 14 | 28.085 | 1414
11 | Phosphorus | Agriculture | 01JAN1669 | 15 | 30.974 | 44
12 | Sulfur | Chemicals | 01JAN1777 | 16 | 32.060 | 115
13 | Chlorine | WaterTreatment | 01JAN1774 | 17 | 35.450 | -101
14 | Iron | Steel | 01JAN2000 | 26 | 55.845 | 1538
15 | Copper | Wiring | 01JAN2000 | 29 | 63.546 | 1085
16 | Zinc | Galvanization | 01JAN1746 | 30 | 65.380 | 420
17 | Silver | Jewelry | 01JAN2000 | 47 | 107.868 | 962
18 | Gold | Investment | 01JAN2000 | 79 | 196.970 | 1064
19 | Carbon | OrganicChem | . | 6 | . | 3550
20 | Silicon | Electronics | . | 14 | . | 1414
21 | Iron | Steel | . | 26 | . | 1538
22 | Copper | Wiring | . | 29 | . | 1085
23 | Gold | Investment | . | 79 | . | 1064
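
PROC APPEND changes the BASE= dataset in place, so after the step above elements_raw itself holds 23 rows, five of them partial duplicates, and every later step that reads it sees them. If the original 18-row table should stay intact, one option is to append to a copy instead. This is a sketch to run in place of the step above; elements_all is a hypothetical dataset name.

```sas
/* Copy the table first so elements_raw is left unchanged */
data elements_all;          /* hypothetical dataset name */
    set elements_raw;
run;

/* FORCE is kept from the original step; Weight and Discovery_Date are absent */
/* from high_temp_elements, so they are set to missing on the appended rows   */
proc append base=elements_all
            data=high_temp_elements force;
run;

proc print data=elements_all;
run;
```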


8.3 MERGE

proc sort data=elements_raw;
    by Element_Name;
run;

proc print data=elements_raw;
run;

OUTPUT:

Obs | Element_Name | Usage | Discovery_Date | Atomic_Number | Weight | Melting_Point
1 | Aluminum | Construction | 01JAN1825 | 13 | 26.980 | 660
2 | Carbon | OrganicChem | 01JAN2000 | 6 | 12.011 | 3550
3 | Carbon | OrganicChem | . | 6 | . | 3550
4 | Chlorine | WaterTreatment | 01JAN1774 | 17 | 35.450 | -101
5 | Copper | Wiring | 01JAN2000 | 29 | 63.546 | 1085
6 | Copper | Wiring | . | 29 | . | 1085
7 | Gold | Investment | 01JAN2000 | 79 | 196.970 | 1064
8 | Gold | Investment | . | 79 | . | 1064
9 | Helium | Cryogenics | 01JAN1895 | 2 | 4.003 | -272
10 | Hydrogen | Fuel | 01JAN1766 | 1 | 1.008 | -259
11 | Iron | Steel | 01JAN2000 | 26 | 55.845 | 1538
12 | Iron | Steel | . | 26 | . | 1538
13 | Lithium | Batteries | 01JAN1817 | 3 | 6.940 | 180
14 | Magnesium | Alloys | 01JAN1755 | 12 | 24.305 | 650
15 | Nitrogen | Fertilizers | 01JAN1772 | 7 | 14.007 | -210
16 | Oxygen | MedicalUse | 01JAN1774 | 8 | 15.999 | -218
17 | Phosphorus | Agriculture | 01JAN1669 | 15 | 30.974 | 44
18 | Silicon | Electronics | 01JAN1824 | 14 | 28.085 | 1414
19 | Silicon | Electronics | . | 14 | . | 1414
20 | Silver | Jewelry | 01JAN2000 | 47 | 107.868 | 962
21 | Sodium | SaltProduction | 01JAN1807 | 11 | 22.990 | 98
22 | Sulfur | Chemicals | 01JAN1777 | 16 | 32.060 | 115
23 | Zinc | Galvanization | 01JAN1746 | 30 | 65.380 | 420


proc sort data=classified_elements;
    by Element_Name;
run;

proc print data=classified_elements;
run;

OUTPUT:

Obs | Element_Name | Usage | Discovery_Date | Atomic_Number | Weight | Melting_Point | Reference_Date | Years_Since_Discovery | Next_Review_Date | Category
1 | Aluminum | Construction | 01JAN1825 | 13 | 26.980 | 660 | 01JAN2025 | 200 | 01JAN2125 | Medium Melting
2 | Carbon | OrganicChem | 01JAN2000 | 6 | 12.011 | 3550 | 01JAN2025 | 25 | 01JAN2300 | High Melting
3 | Chlorine | WaterTreatment | 01JAN1774 | 17 | 35.450 | -101 | 01JAN2025 | 251 | 01JAN2074 | Gas or Cryogenic
4 | Copper | Wiring | 01JAN2000 | 29 | 63.546 | 1085 | 01JAN2025 | 25 | 01JAN2300 | High Melting
5 | Gold | Investment | 01JAN2000 | 79 | 196.970 | 1064 | 01JAN2025 | 25 | 01JAN2300 | High Melting
6 | Helium | Cryogenics | 01JAN1895 | 2 | 4.003 | -272 | 01JAN2025 | 130 | 01JAN2195 | Gas or Cryogenic
7 | Hydrogen | Fuel | 01JAN1766 | 1 | 1.008 | -259 | 01JAN2025 | 259 | 01JAN2066 | Gas or Cryogenic
8 | Iron | Steel | 01JAN2000 | 26 | 55.845 | 1538 | 01JAN2025 | 25 | 01JAN2300 | High Melting
9 | Lithium | Batteries | 01JAN1817 | 3 | 6.940 | 180 | 01JAN2025 | 208 | 01JAN2117 | Low Melting
10 | Magnesium | Alloys | 01JAN1755 | 12 | 24.305 | 650 | 01JAN2025 | 270 | 01JAN2055 | Medium Melting
11 | Nitrogen | Fertilizers | 01JAN1772 | 7 | 14.007 | -210 | 01JAN2025 | 253 | 01JAN2072 | Gas or Cryogenic
12 | Oxygen | MedicalUse | 01JAN1774 | 8 | 15.999 | -218 | 01JAN2025 | 251 | 01JAN2074 | Gas or Cryogenic
13 | Phosphorus | Agriculture | 01JAN1669 | 15 | 30.974 | 44 | 01JAN2025 | 356 | 01JAN1969 | Low Melting
14 | Silicon | Electronics | 01JAN1824 | 14 | 28.085 | 1414 | 01JAN2025 | 201 | 01JAN2124 | High Melting
15 | Silver | Jewelry | 01JAN2000 | 47 | 107.868 | 962 | 01JAN2025 | 25 | 01JAN2300 | Medium Melting
16 | Sodium | SaltProduction | 01JAN1807 | 11 | 22.990 | 98 | 01JAN2025 | 218 | 01JAN2107 | Low Melting
17 | Sulfur | Chemicals | 01JAN1777 | 16 | 32.060 | 115 | 01JAN2025 | 248 | 01JAN2077 | Low Melting
18 | Zinc | Galvanization | 01JAN1746 | 30 | 65.380 | 420 | 01JAN2025 | 279 | 01JAN2046 | Low Melting


data merged_elements;
    /* Match rows on Element_Name; both inputs were sorted by it above */
    merge elements_raw classified_elements;
    by Element_Name;
run;

proc print data=merged_elements;
run;

OUTPUT:

Obs | Element_Name | Usage | Discovery_Date | Atomic_Number | Weight | Melting_Point | Reference_Date | Years_Since_Discovery | Next_Review_Date | Category
1 | Aluminum | Construction | 01JAN1825 | 13 | 26.980 | 660 | 01JAN2025 | 200 | 01JAN2125 | Medium Melting
2 | Carbon | OrganicChem | 01JAN2000 | 6 | 12.011 | 3550 | 01JAN2025 | 25 | 01JAN2300 | High Melting
3 | Carbon | OrganicChem | . | 6 | . | 3550 | 01JAN2025 | 25 | 01JAN2300 | High Melting
4 | Chlorine | WaterTreatment | 01JAN1774 | 17 | 35.450 | -101 | 01JAN2025 | 251 | 01JAN2074 | Gas or Cryogenic
5 | Copper | Wiring | 01JAN2000 | 29 | 63.546 | 1085 | 01JAN2025 | 25 | 01JAN2300 | High Melting
6 | Copper | Wiring | . | 29 | . | 1085 | 01JAN2025 | 25 | 01JAN2300 | High Melting
7 | Gold | Investment | 01JAN2000 | 79 | 196.970 | 1064 | 01JAN2025 | 25 | 01JAN2300 | High Melting
8 | Gold | Investment | . | 79 | . | 1064 | 01JAN2025 | 25 | 01JAN2300 | High Melting
9 | Helium | Cryogenics | 01JAN1895 | 2 | 4.003 | -272 | 01JAN2025 | 130 | 01JAN2195 | Gas or Cryogenic
10 | Hydrogen | Fuel | 01JAN1766 | 1 | 1.008 | -259 | 01JAN2025 | 259 | 01JAN2066 | Gas or Cryogenic
11 | Iron | Steel | 01JAN2000 | 26 | 55.845 | 1538 | 01JAN2025 | 25 | 01JAN2300 | High Melting
12 | Iron | Steel | . | 26 | . | 1538 | 01JAN2025 | 25 | 01JAN2300 | High Melting
13 | Lithium | Batteries | 01JAN1817 | 3 | 6.940 | 180 | 01JAN2025 | 208 | 01JAN2117 | Low Melting
14 | Magnesium | Alloys | 01JAN1755 | 12 | 24.305 | 650 | 01JAN2025 | 270 | 01JAN2055 | Medium Melting
15 | Nitrogen | Fertilizers | 01JAN1772 | 7 | 14.007 | -210 | 01JAN2025 | 253 | 01JAN2072 | Gas or Cryogenic
16 | Oxygen | MedicalUse | 01JAN1774 | 8 | 15.999 | -218 | 01JAN2025 | 251 | 01JAN2074 | Gas or Cryogenic
17 | Phosphorus | Agriculture | 01JAN1669 | 15 | 30.974 | 44 | 01JAN2025 | 356 | 01JAN1969 | Low Melting
18 | Silicon | Electronics | 01JAN1824 | 14 | 28.085 | 1414 | 01JAN2025 | 201 | 01JAN2124 | High Melting
19 | Silicon | Electronics | . | 14 | . | 1414 | 01JAN2025 | 201 | 01JAN2124 | High Melting
20 | Silver | Jewelry | 01JAN2000 | 47 | 107.868 | 962 | 01JAN2025 | 25 | 01JAN2300 | Medium Melting
21 | Sodium | SaltProduction | 01JAN1807 | 11 | 22.990 | 98 | 01JAN2025 | 218 | 01JAN2107 | Low Melting
22 | Sulfur | Chemicals | 01JAN1777 | 16 | 32.060 | 115 | 01JAN2025 | 248 | 01JAN2077 | Low Melting
23 | Zinc | Galvanization | 01JAN1746 | 30 | 65.380 | 420 | 01JAN2025 | 279 | 01JAN2046 | Low Melting
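
The merged table has 23 rows rather than 18 because elements_raw still contains the five rows appended in section 8.2, so five BY groups hold two rows on one side. A hedged sketch of a one-row-per-element merge follows. It assumes both tables are already sorted by Element_Name as above; elements_unique, merged_checked, in_raw, and in_class are hypothetical names.

```sas
/* Keep the first row per element; the sort is stable by default, so the */
/* complete original row comes before the appended partial row           */
proc sort data=elements_raw out=elements_unique nodupkey;
    by Element_Name;
run;

data merged_checked;
    /* IN= creates temporary flags showing which input contributed to each row */
    merge elements_unique(in=in_raw)
          classified_elements(in=in_class);
    by Element_Name;

    /* Keep only elements present in both tables */
    if in_raw and in_class;
run;

proc print data=merged_checked;
run;
```

With the data used here this yields one row for each of the 18 elements. For variables that exist in both inputs, the value from the dataset listed last in the MERGE statement is the one kept.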


9. PROC TRANSPOSE – Reshaping Data

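/* Turn the Weight and Melting_Point columns into rows, one pair per element */
proc transpose data=elements_dates
               out=elements_transposed;
    /* NOTSORTED lets BY groups follow the existing (unsorted) row order */
    by Element_Name NotSorted;
    var Weight Melting_Point;
run;

proc print data=elements_transposed;
run;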

OUTPUT:

Obs | Element_Name | _NAME_ | COL1
1 | Hydrogen | Weight | 1.01
2 | Hydrogen | Melting_Point | -259.00
3 | Helium | Weight | 4.00
4 | Helium | Melting_Point | -272.00
5 | Lithium | Weight | 6.94
6 | Lithium | Melting_Point | 180.00
7 | Carbon | Weight | 12.01
8 | Carbon | Melting_Point | 3550.00
9 | Nitrogen | Weight | 14.01
10 | Nitrogen | Melting_Point | -210.00
11 | Oxygen | Weight | 16.00
12 | Oxygen | Melting_Point | -218.00
13 | Sodium | Weight | 22.99
14 | Sodium | Melting_Point | 98.00
15 | Magnesium | Weight | 24.31
16 | Magnesium | Melting_Point | 650.00
17 | Aluminum | Weight | 26.98
18 | Aluminum | Melting_Point | 660.00
19 | Silicon | Weight | 28.09
20 | Silicon | Melting_Point | 1414.00
21 | Phosphorus | Weight | 30.97
22 | Phosphorus | Melting_Point | 44.00
23 | Sulfur | Weight | 32.06
24 | Sulfur | Melting_Point | 115.00
25 | Chlorine | Weight | 35.45
26 | Chlorine | Melting_Point | -101.00
27 | Iron | Weight | 55.85
28 | Iron | Melting_Point | 1538.00
29 | Copper | Weight | 63.55
30 | Copper | Melting_Point | 1085.00
31 | Zinc | Weight | 65.38
32 | Zinc | Melting_Point | 420.00
33 | Silver | Weight | 107.87
34 | Silver | Melting_Point | 962.00
35 | Gold | Weight | 196.97
36 | Gold | Melting_Point | 1064.00

Why This Is Important

·       Reshaping data into the layout a report needs

·       TLF (tables, listings, and figures) creation

·       SDTM / ADaM reshaping logic
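
The default output names (_NAME_ and COL1) are rarely what a report needs, and the reshaping often has to go both ways. The sketch below shows one way to name the long-format columns and then pivot back to one row per element. The names elements_long, elements_wide, Measure, and Value are hypothetical.

```sas
/* Wide to long, with readable column names */
proc transpose data=elements_dates
               out=elements_long(rename=(COL1=Value))
               name=Measure;              /* replaces the default _NAME_ column */
    by Element_Name notsorted;
    var Weight Melting_Point;
run;

/* Long back to wide: each distinct Measure value becomes a column */
proc transpose data=elements_long
               out=elements_wide(drop=_NAME_);
    by Element_Name notsorted;
    id Measure;
    var Value;
run;

proc print data=elements_wide;
run;
```

The ID statement takes the new column names from the values of Measure, so elements_wide again has Weight and Melting_Point columns.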


10. Additional SAS Statements Used

Statement | Purpose
DATA | Dataset creation
SET | Row-wise concatenation
MERGE | Column-wise joining
APPEND | Efficient dataset expansion
FORMAT | Presentation control
INPUT | Structured data reading
WHERE | Conditional filtering

 

11. Business Interpretation

High Melting Elements

·       Iron, Silicon, Gold

·       Used in construction, electronics, and finance

Low Melting / Cryogenic Elements

·       Helium, Hydrogen

·       Used in aerospace and medicine

 

 

12. What You Learn From This Project

Technical Skills

·       Base SAS mastery

·       PROC SQL proficiency

·       Macro automation

·       Date intelligence

Analytical Thinking

·       Classification logic

·       Statistical interpretation

·       Real-world data modeling

Interview Readiness

·       End-to-end SAS workflow

·       Dataset transformation

·       Production-level coding standards

13. How This Helps Your Career

This project can be confidently used as:

·       Interview explanation project

·       Portfolio project

·       Blog or tutorial content

·       Training demonstration

Especially valuable for roles like:

·       SAS Programmer

·       Clinical Data Analyst

·       Data Scientist (SAS track)

·       Statistical Programmer

14. Conclusion

This Chemical Elements SAS Project demonstrates how raw scientific data can be transformed into actionable insights using professional SAS programming practices.

It mirrors real enterprise workflows, covers most frequently tested SAS topics, and proves your capability to handle complex datasets with confidence.



About the Author:

SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.


Disclaimer:

The datasets and analysis in this article are created for educational and demonstration purposes only. They are not an authoritative source of chemical elements data.



