FAMOUS SCIENTISTS DATA ANALYSIS USING PROC SQL | PROC MEANS | PROC RANK | PROC FORMAT | PROC SGPLOT | MACROS | DATE FUNCTIONS FOR IMPACT EVALUATION | RANKING INSIGHTS | TIME-BASED SCIENTIFIC CONTRIBUTION STUDY
options nocenter;
1.DATASET CREATION
data scientists;
length Scientist_Name $30 Country $20 Field $25 Awards $40 Inventions $50 Birth_Date $10;
format Discovery_Date date9.;
input Scientist_Name $ Country $ Field $ Awards $ Inventions $ Impact_Score
Birth_Date $ Discovery_Date :date9.;
datalines;
Einstein Germany Physics Nobel_Prize Relativity 98 14MAR1879 01JAN1905
Newton UK Physics None Laws_of_Motion 97 04JAN1643 01JAN1687
Curie Poland Chemistry Nobel_Prize Radioactivity 96 07NOV1867 01JAN1898
Tesla Serbia Engineering None AC_Current 95 10JUL1856 01JAN1887
Darwin UK Biology None Evolution 94 12FEB1809 01JAN1859
Galileo Italy Astronomy None Telescope 93 15FEB1564 01JAN1610
Edison USA Engineering None Light_Bulb 92 11FEB1847 01JAN1879
Faraday UK Physics None Electromagnetism 91 22SEP1791 01JAN1831
Bohr Denmark Physics Nobel_Prize Atomic_Model 90 07OCT1885 01JAN1913
Feynman USA Physics Nobel_Prize Quantum_Theory 89 11MAY1918 01JAN1948
Hawking UK Physics None Black_Holes 88 08JAN1942 01JAN1974
Raman India Physics Nobel_Prize Raman_Effect 87 07NOV1888 28FEB1928
;
run;
proc print data=scientists;
run;
OUTPUT:
| Obs | Scientist_Name | Country | Field | Awards | Inventions | Birth_Date | Discovery_Date | Impact_Score |
|---|---|---|---|---|---|---|---|---|
| 1 | Einstein | Germany | Physics | Nobel_Prize | Relativity | 14MAR1879 | 01JAN1905 | 98 |
| 2 | Newton | UK | Physics | None | Laws_of_Motion | 04JAN1643 | 01JAN1687 | 97 |
| 3 | Curie | Poland | Chemistry | Nobel_Prize | Radioactivity | 07NOV1867 | 01JAN1898 | 96 |
| 4 | Tesla | Serbia | Engineering | None | AC_Current | 10JUL1856 | 01JAN1887 | 95 |
| 5 | Darwin | UK | Biology | None | Evolution | 12FEB1809 | 01JAN1859 | 94 |
| 6 | Galileo | Italy | Astronomy | None | Telescope | 15FEB1564 | 01JAN1610 | 93 |
| 7 | Edison | USA | Engineering | None | Light_Bulb | 11FEB1847 | 01JAN1879 | 92 |
| 8 | Faraday | UK | Physics | None | Electromagnetism | 22SEP1791 | 01JAN1831 | 91 |
| 9 | Bohr | Denmark | Physics | Nobel_Prize | Atomic_Model | 07OCT1885 | 01JAN1913 | 90 |
| 10 | Feynman | USA | Physics | Nobel_Prize | Quantum_Theory | 11MAY1918 | 01JAN1948 | 89 |
| 11 | Hawking | UK | Physics | None | Black_Holes | 08JAN1942 | 01JAN1974 | 88 |
| 12 | Raman | India | Physics | Nobel_Prize | Raman_Effect | 07NOV1888 | 28FEB1928 | 87 |
2.DATA REVIEW AND STRUCTURE CHECK
proc contents data=scientists;
run;
OUTPUT:
The CONTENTS Procedure
| Data Set Name | WORK.SCIENTISTS | Observations | 12 |
|---|---|---|---|
| Member Type | DATA | Variables | 8 |
| Engine | V9 | Indexes | 0 |
| Created | 12/17/2025 07:16:14 | Observation Length | 192 |
| Last Modified | 12/17/2025 07:16:14 | Deleted Observations | 0 |
| Protection | Compressed | NO | |
| Data Set Type | Sorted | NO | |
| Label | |||
| Data Representation | SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64 | ||
| Encoding | utf-8 Unicode (UTF-8) |
| Engine/Host Dependent Information | |
|---|---|
| Data Set Page Size | 131072 |
| Number of Data Set Pages | 1 |
| First Data Page | 1 |
| Max Obs per Page | 682 |
| Obs in First Data Page | 12 |
| Number of Data Set Repairs | 0 |
| Filename | /saswork/SAS_work4389000164D4_odaws02-apse1-2.oda.sas.com/SAS_work894F000164D4_odaws02-apse1-2.oda.sas.com/scientists.sas7bdat |
| Release Created | 9.0401M8 |
| Host Created | Linux |
| Inode Number | 1046139 |
| Access Permission | rw-r--r-- |
| Owner Name | u63247146 |
| File Size | 256KB |
| File Size (bytes) | 262144 |
| Alphabetic List of Variables and Attributes | ||||
|---|---|---|---|---|
| # | Variable | Type | Len | Format |
| 4 | Awards | Char | 40 | |
| 6 | Birth_Date | Char | 10 | |
| 2 | Country | Char | 20 | |
| 7 | Discovery_Date | Num | 8 | DATE9. |
| 3 | Field | Char | 25 | |
| 8 | Impact_Score | Num | 8 | |
| 5 | Inventions | Char | 50 | |
| 1 | Scientist_Name | Char | 30 | |
proc print data=scientists(obs=5);
run;
OUTPUT:
| Obs | Scientist_Name | Country | Field | Awards | Inventions | Birth_Date | Discovery_Date | Impact_Score |
|---|---|---|---|---|---|---|---|---|
| 1 | Einstein | Germany | Physics | Nobel_Prize | Relativity | 14MAR1879 | 01JAN1905 | 98 |
| 2 | Newton | UK | Physics | None | Laws_of_Motion | 04JAN1643 | 01JAN1687 | 97 |
| 3 | Curie | Poland | Chemistry | Nobel_Prize | Radioactivity | 07NOV1867 | 01JAN1898 | 96 |
| 4 | Tesla | Serbia | Engineering | None | AC_Current | 10JUL1856 | 01JAN1887 | 95 |
| 5 | Darwin | UK | Biology | None | Evolution | 12FEB1809 | 01JAN1859 | 94 |
3.Create a Derived Dataset with Years Since Discovery
proc sql;
create table sci_analysis as
select *,
intck('year', Discovery_Date, today()) as Years_Since_Discovery
from scientists;
quit;
proc print data=sci_analysis;
run;
OUTPUT:
| Obs | Scientist_Name | Country | Field | Awards | Inventions | Birth_Date | Discovery_Date | Impact_Score | Years_Since_Discovery |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Einstein | Germany | Physics | Nobel_Prize | Relativity | 14MAR1879 | 01JAN1905 | 98 | 120 |
| 2 | Newton | UK | Physics | None | Laws_of_Motion | 04JAN1643 | 01JAN1687 | 97 | 338 |
| 3 | Curie | Poland | Chemistry | Nobel_Prize | Radioactivity | 07NOV1867 | 01JAN1898 | 96 | 127 |
| 4 | Tesla | Serbia | Engineering | None | AC_Current | 10JUL1856 | 01JAN1887 | 95 | 138 |
| 5 | Darwin | UK | Biology | None | Evolution | 12FEB1809 | 01JAN1859 | 94 | 166 |
| 6 | Galileo | Italy | Astronomy | None | Telescope | 15FEB1564 | 01JAN1610 | 93 | 415 |
| 7 | Edison | USA | Engineering | None | Light_Bulb | 11FEB1847 | 01JAN1879 | 92 | 146 |
| 8 | Faraday | UK | Physics | None | Electromagnetism | 22SEP1791 | 01JAN1831 | 91 | 194 |
| 9 | Bohr | Denmark | Physics | Nobel_Prize | Atomic_Model | 07OCT1885 | 01JAN1913 | 90 | 112 |
| 10 | Feynman | USA | Physics | Nobel_Prize | Quantum_Theory | 11MAY1918 | 01JAN1948 | 89 | 77 |
| 11 | Hawking | UK | Physics | None | Black_Holes | 08JAN1942 | 01JAN1974 | 88 | 51 |
| 12 | Raman | India | Physics | Nobel_Prize | Raman_Effect | 07NOV1888 | 28FEB1928 | 87 | 97 |
4.DATE ADVANCEMENT USING INTNX
data sci_dates;
set sci_analysis;
Next_Review_Date = intnx('year', Discovery_Date, 10, 'same');
format Next_Review_Date date9.;
run;
proc print data=sci_dates;
run;
OUTPUT:
| Obs | Scientist_Name | Country | Field | Awards | Inventions | Birth_Date | Discovery_Date | Impact_Score | Years_Since_Discovery | Next_Review_Date |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Einstein | Germany | Physics | Nobel_Prize | Relativity | 14MAR1879 | 01JAN1905 | 98 | 120 | 01JAN1915 |
| 2 | Newton | UK | Physics | None | Laws_of_Motion | 04JAN1643 | 01JAN1687 | 97 | 338 | 01JAN1697 |
| 3 | Curie | Poland | Chemistry | Nobel_Prize | Radioactivity | 07NOV1867 | 01JAN1898 | 96 | 127 | 01JAN1908 |
| 4 | Tesla | Serbia | Engineering | None | AC_Current | 10JUL1856 | 01JAN1887 | 95 | 138 | 01JAN1897 |
| 5 | Darwin | UK | Biology | None | Evolution | 12FEB1809 | 01JAN1859 | 94 | 166 | 01JAN1869 |
| 6 | Galileo | Italy | Astronomy | None | Telescope | 15FEB1564 | 01JAN1610 | 93 | 415 | 01JAN1620 |
| 7 | Edison | USA | Engineering | None | Light_Bulb | 11FEB1847 | 01JAN1879 | 92 | 146 | 01JAN1889 |
| 8 | Faraday | UK | Physics | None | Electromagnetism | 22SEP1791 | 01JAN1831 | 91 | 194 | 01JAN1841 |
| 9 | Bohr | Denmark | Physics | Nobel_Prize | Atomic_Model | 07OCT1885 | 01JAN1913 | 90 | 112 | 01JAN1923 |
| 10 | Feynman | USA | Physics | Nobel_Prize | Quantum_Theory | 11MAY1918 | 01JAN1948 | 89 | 77 | 01JAN1958 |
| 11 | Hawking | UK | Physics | None | Black_Holes | 08JAN1942 | 01JAN1974 | 88 | 51 | 01JAN1984 |
| 12 | Raman | India | Physics | Nobel_Prize | Raman_Effect | 07NOV1888 | 28FEB1928 | 87 | 97 | 28FEB1938 |
5.STATISTICAL SUMMARY (PROC MEANS)
proc means data=sci_dates min max mean;
var Impact_Score Years_Since_Discovery;
run;
OUTPUT:
The MEANS Procedure
| Variable | Minimum | Maximum | Mean |
|---|---|---|---|
Impact_Score Years_Since_Discovery | 87.0000000 51.0000000 | 98.0000000 415.0000000 | 92.5000000 165.0833333 |
6.IMPACT RANKING USING PROC RANK
proc rank data=sci_dates out=sci_rank descending;
var Impact_Score;
ranks Impact_Rank;
run;
proc print data=sci_rank;
run;
OUTPUT:
| Obs | Scientist_Name | Country | Field | Awards | Inventions | Birth_Date | Discovery_Date | Impact_Score | Years_Since_Discovery | Next_Review_Date | Impact_Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Einstein | Germany | Physics | Nobel_Prize | Relativity | 14MAR1879 | 01JAN1905 | 98 | 120 | 01JAN1915 | 1 |
| 2 | Newton | UK | Physics | None | Laws_of_Motion | 04JAN1643 | 01JAN1687 | 97 | 338 | 01JAN1697 | 2 |
| 3 | Curie | Poland | Chemistry | Nobel_Prize | Radioactivity | 07NOV1867 | 01JAN1898 | 96 | 127 | 01JAN1908 | 3 |
| 4 | Tesla | Serbia | Engineering | None | AC_Current | 10JUL1856 | 01JAN1887 | 95 | 138 | 01JAN1897 | 4 |
| 5 | Darwin | UK | Biology | None | Evolution | 12FEB1809 | 01JAN1859 | 94 | 166 | 01JAN1869 | 5 |
| 6 | Galileo | Italy | Astronomy | None | Telescope | 15FEB1564 | 01JAN1610 | 93 | 415 | 01JAN1620 | 6 |
| 7 | Edison | USA | Engineering | None | Light_Bulb | 11FEB1847 | 01JAN1879 | 92 | 146 | 01JAN1889 | 7 |
| 8 | Faraday | UK | Physics | None | Electromagnetism | 22SEP1791 | 01JAN1831 | 91 | 194 | 01JAN1841 | 8 |
| 9 | Bohr | Denmark | Physics | Nobel_Prize | Atomic_Model | 07OCT1885 | 01JAN1913 | 90 | 112 | 01JAN1923 | 9 |
| 10 | Feynman | USA | Physics | Nobel_Prize | Quantum_Theory | 11MAY1918 | 01JAN1948 | 89 | 77 | 01JAN1958 | 10 |
| 11 | Hawking | UK | Physics | None | Black_Holes | 08JAN1942 | 01JAN1974 | 88 | 51 | 01JAN1984 | 11 |
| 12 | Raman | India | Physics | Nobel_Prize | Raman_Effect | 07NOV1888 | 28FEB1928 | 87 | 97 | 28FEB1938 | 12 |
7.CUSTOM FORMATS USING PROC FORMAT
proc format;
value impactfmt
low - <85 = 'LOW IMPACT'
85 - <95 = 'HIGH IMPACT'
95 - high = 'LEGENDARY IMPACT';
run;
LOG:
data sci_format;
set sci_rank;
format Impact_Score impactfmt.;
run;
proc print data=sci_format;
run;
OUTPUT:
| Obs | Scientist_Name | Country | Field | Awards | Inventions | Birth_Date | Discovery_Date | Impact_Score | Years_Since_Discovery | Next_Review_Date | Impact_Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Einstein | Germany | Physics | Nobel_Prize | Relativity | 14MAR1879 | 01JAN1905 | LEGENDARY IMPACT | 120 | 01JAN1915 | 1 |
| 2 | Newton | UK | Physics | None | Laws_of_Motion | 04JAN1643 | 01JAN1687 | LEGENDARY IMPACT | 338 | 01JAN1697 | 2 |
| 3 | Curie | Poland | Chemistry | Nobel_Prize | Radioactivity | 07NOV1867 | 01JAN1898 | LEGENDARY IMPACT | 127 | 01JAN1908 | 3 |
| 4 | Tesla | Serbia | Engineering | None | AC_Current | 10JUL1856 | 01JAN1887 | LEGENDARY IMPACT | 138 | 01JAN1897 | 4 |
| 5 | Darwin | UK | Biology | None | Evolution | 12FEB1809 | 01JAN1859 | HIGH IMPACT | 166 | 01JAN1869 | 5 |
| 6 | Galileo | Italy | Astronomy | None | Telescope | 15FEB1564 | 01JAN1610 | HIGH IMPACT | 415 | 01JAN1620 | 6 |
| 7 | Edison | USA | Engineering | None | Light_Bulb | 11FEB1847 | 01JAN1879 | HIGH IMPACT | 146 | 01JAN1889 | 7 |
| 8 | Faraday | UK | Physics | None | Electromagnetism | 22SEP1791 | 01JAN1831 | HIGH IMPACT | 194 | 01JAN1841 | 8 |
| 9 | Bohr | Denmark | Physics | Nobel_Prize | Atomic_Model | 07OCT1885 | 01JAN1913 | HIGH IMPACT | 112 | 01JAN1923 | 9 |
| 10 | Feynman | USA | Physics | Nobel_Prize | Quantum_Theory | 11MAY1918 | 01JAN1948 | HIGH IMPACT | 77 | 01JAN1958 | 10 |
| 11 | Hawking | UK | Physics | None | Black_Holes | 08JAN1942 | 01JAN1974 | HIGH IMPACT | 51 | 01JAN1984 | 11 |
| 12 | Raman | India | Physics | Nobel_Prize | Raman_Effect | 07NOV1888 | 28FEB1928 | HIGH IMPACT | 97 | 28FEB1938 | 12 |
8.MACRO FOR AUTOMATED RANKING REPORT
%macro impact_report(min_score);
proc sql;
select Scientist_Name, Field, Impact_Score, Impact_Rank
from sci_rank
where Impact_Score >= &min_score;
quit;
%mend;
%impact_report(90);
OUTPUT:
| Scientist_Name | Field | Impact_Score | Rank for Variable Impact_Score |
|---|---|---|---|
| Einstein | Physics | 98 | 1 |
| Newton | Physics | 97 | 2 |
| Curie | Chemistry | 96 | 3 |
| Tesla | Engineering | 95 | 4 |
| Darwin | Biology | 94 | 5 |
| Galileo | Astronomy | 93 | 6 |
| Edison | Engineering | 92 | 7 |
| Faraday | Physics | 91 | 8 |
| Bohr | Physics | 90 | 9 |
9.PROC FREQ
proc freq data=sci_format;
tables Field Country;
run;
OUTPUT:
The FREQ Procedure
| Field | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| Astronomy | 1 | 8.33 | 1 | 8.33 |
| Biology | 1 | 8.33 | 2 | 16.67 |
| Chemistry | 1 | 8.33 | 3 | 25.00 |
| Engineering | 2 | 16.67 | 5 | 41.67 |
| Physics | 7 | 58.33 | 12 | 100.00 |
| Country | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| Denmark | 1 | 8.33 | 1 | 8.33 |
| Germany | 1 | 8.33 | 2 | 16.67 |
| India | 1 | 8.33 | 3 | 25.00 |
| Italy | 1 | 8.33 | 4 | 33.33 |
| Poland | 1 | 8.33 | 5 | 41.67 |
| Serbia | 1 | 8.33 | 6 | 50.00 |
| UK | 4 | 33.33 | 10 | 83.33 |
| USA | 2 | 16.67 | 12 | 100.00 |
No comments:
Post a Comment