282.100+ HISTORIC BUSINESSES WORLDWIDE DATASET | PROC CONTENTS | PROC PRINT | PROC FREQ | PROC MEANS | PROC SORT | PROC REPORT| PROC SQL |MACRO SHOW_TOP_INDUSTRIES | MACRO COUNTRY_REPORT

100+ HISTORIC BUSINESSES WORLDWIDE DATASET | PROC CONTENTS | PROC PRINT | PROC FREQ | PROC MEANS | PROC SORT | PROC REPORT| PROC SQL |MACRO SHOW_TOP_INDUSTRIES | MACRO COUNTRY_REPORT

 /*CREATING A DATASET OF 100+ HISTORIC BUSINESSES */

1.Macro lists for generating plausible names, countries, industries and founders 

%let prefixes = Abernathy Burling Cortes Hanover Silkwood Tudor Imperial Royal Eastern

                Atlantic Pioneer Union Continental Heritage OldTown Windsor Blackwell 

                Greenfield Lancaster Brighton Franklin Sterling Mayfair Cambridge;


%let suffixes = %str(Trading Co) %str(Mercantile) %str(& Co.) %str(Foundry Works) 

                %str(Emporium) %str(& Sons) %str(& Partners) %str(Consortium) 

                %str(Guild) %str(Company) %str(& Company) %str(Manufactories) 

                %str(Shipping House) %str(Warehouses) %str(Baker & Co.) %str(Brewer & Co.);


%let countries = United_Kingdom United_States France Germany Italy Spain India China Japan

                 Netherlands Portugal Sweden Norway Denmark Belgium Switzerland Austria 

                 Turkey Russia Brazil Mexico Australia Canada South_Africa Egypt Morocco 

                 Saudi_Arabia UAE Singapore Hong_Kong Indonesia Malaysia Vietnam Thailand;


%let industries = Textiles Shipping Brewing Banking Metalworking Ceramics Agriculture

                  SpiceTrading Printing Publishing Glassmaking Carpentry Shipbuilding 

                  SilkTrade TeaTrade CoffeeTrade Insurance Retailing Mining LeatherTanning 

                  Jewelry Watchmaking Porcelain;


%let founders = Thomas_Abbott William_Burling Jose_Cortes Hans_Hanover Li_Silkwood Raj_Tudor

                Alessandro_Imperial Maria_Royal Omar_Eastern John_Atlantic Pierre_Pioneer 

                Ana_Union Karl_Continental Isabel_Heritage Suresh_OldTown Henry_Windsor 

                Paul_Blackwell Grace_Greenfield Marcus_Lancaster Sophia_Brighton 

                Noah_Franklin Emily_Sterling;


2.Count words in macro lists 

%let npref = %sysfunc(countw(&prefixes));

%let nsuf  = %sysfunc(countw(&suffixes));

%let nctry = %sysfunc(countw(&countries));

%let nind  = %sysfunc(countw(&industries));

%let nfound= %sysfunc(countw(&founders));


options nocenter;

data old_businesses;

  length Business_ID 8 Name $60 Country $30 Founded_Year 8 

         Industry $30 Founder $30 Status $8;

  call streaminit(12345);

  do Business_ID = 1 to 110;

    /* Pick prefix, suffix, country, industry, founder deterministically for coverage */

    pref = scan("&prefixes", mod(Business_ID-1,&npref)+1, ' ');

    suf  = scan("&suffixes", mod(Business_ID-1,&nsuf)+1, ' ');

    Name = catx(' ', pref, suf);

    Country = tranwrd(scan("&countries", mod(Business_ID-1,&nctry)+1, ' '), '_', ' ');

    Industry = scan("&industries", mod(Business_ID-1,&nind)+1, ' ');

    Founder  = tranwrd(scan("&founders", mod(Business_ID-1,&nfound)+1, ' '), '_', ' ');

    /* Older businesses: random year between 1600 and 1950, with 10% modern */

    if rand('uniform') < 0.10 then 

      Founded_Year = 1951 + floor(rand('uniform')*70); 

    else 

      Founded_Year = 1600 + floor(rand('uniform')*351);

    /* Status: many old firms defunct, some still active */

    if Founded_Year < 1850 then do;

      Status = ifc(rand('uniform')<0.5,'Defunct','Active');

    end;

    else do;

      Status = ifc(rand('uniform')<0.7,'Active','Defunct');

    end;

    output;

  end;

  drop pref suf;

run;

proc print data=old_businesses(obs=20);

  title "Sample of Historic Businesses";

run;

Output:

Sample of Historic Businesses

ObsBusiness_IDNameCountryFounded_YearIndustryFounderStatus
11Abernathy TradingUnited Kingdom1948TextilesThomas AbbottActive
22Burling CoUnited States1889ShippingWilliam BurlingActive
33Cortes MercantileFrance1734BrewingJose CortesDefunct
44Hanover &Germany1781BankingHans HanoverActive
55Silkwood Co.Italy1927MetalworkingLi SilkwoodActive
66Tudor FoundrySpain1635CeramicsRaj TudorActive
77Imperial WorksIndia1895AgricultureAlessandro ImperialDefunct
88Royal EmporiumChina1632SpiceTradingMaria RoyalActive
99Eastern &Japan1845PrintingOmar EasternActive
1010Atlantic SonsNetherlands1930PublishingJohn AtlanticActive
1111Pioneer &Portugal1863GlassmakingPierre PioneerDefunct
1212Union PartnersSweden1767CarpentryAna UnionActive
1313Continental ConsortiumNorway1895ShipbuildingKarl ContinentalDefunct
1414Heritage GuildDenmark1885SilkTradeIsabel HeritageActive
1515OldTown CompanyBelgium1738TeaTradeSuresh OldTownActive
1616Windsor &Switzerland1850CoffeeTradeHenry WindsorActive
1717Blackwell CompanyAustria1724InsurancePaul BlackwellDefunct
1818Greenfield ManufactoriesTurkey1798RetailingGrace GreenfieldDefunct
1919Lancaster ShippingRussia1989MiningMarcus LancasterDefunct
2020Brighton HouseBrazil1952LeatherTanningSophia BrightonDefunct

3.PROC CONTENTS - 

purpose: show dataset structure and variable attributes in one listing 

proc contents data=old_businesses position;

run; 

Output:

The CONTENTS Procedure

Data Set NameWORK.OLD_BUSINESSESObservations110
Member TypeDATAVariables7
EngineV9Indexes0
Created09/07/2025 18:22:48Observation Length176
Last Modified09/07/2025 18:22:48Deleted Observations0
Protection CompressedNO
Data Set Type SortedNO
Label   
Data RepresentationSOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64  
Encodingutf-8 Unicode (UTF-8)  
Engine/Host Dependent Information
Data Set Page Size131072
Number of Data Set Pages1
First Data Page1
Max Obs per Page743
Obs in First Data Page110
Number of Data Set Repairs0
Filename/saswork/SAS_work662C0001A968_odaws02-apse1-2.oda.sas.com/SAS_work88120001A968_odaws02-apse1-2.oda.sas.com/old_businesses.sas7bdat
Release Created9.0401M8
Host CreatedLinux
Inode Number1186018
Access Permissionrw-r--r--
Owner Nameu63247146
File Size256KB
File Size (bytes)262144
Alphabetic List of Variables and Attributes
#VariableTypeLen
1Business_IDNum8
3CountryChar30
4Founded_YearNum8
6FounderChar30
5IndustryChar30
2NameChar60
7StatusChar8
Variables in Creation Order
#VariableTypeLen
1Business_IDNum8
2NameChar60
3CountryChar30
4Founded_YearNum8
5IndustryChar30
6FounderChar30
7StatusChar8

4.PROC PRINT - 

purpose: print first 15 observations for a quick visual check 

proc print data=old_businesses(obs=15) label;

  title 'First 15 Historic Businesses - Quick Look';

run;  

Output:

First 15 Historic Businesses - Quick Look

ObsBusiness_IDNameCountryFounded_YearIndustryFounderStatus
11Abernathy TradingUnited Kingdom1948TextilesThomas AbbottActive
22Burling CoUnited States1889ShippingWilliam BurlingActive
33Cortes MercantileFrance1734BrewingJose CortesDefunct
44Hanover &Germany1781BankingHans HanoverActive
55Silkwood Co.Italy1927MetalworkingLi SilkwoodActive
66Tudor FoundrySpain1635CeramicsRaj TudorActive
77Imperial WorksIndia1895AgricultureAlessandro ImperialDefunct
88Royal EmporiumChina1632SpiceTradingMaria RoyalActive
99Eastern &Japan1845PrintingOmar EasternActive
1010Atlantic SonsNetherlands1930PublishingJohn AtlanticActive
1111Pioneer &Portugal1863GlassmakingPierre PioneerDefunct
1212Union PartnersSweden1767CarpentryAna UnionActive
1313Continental ConsortiumNorway1895ShipbuildingKarl ContinentalDefunct
1414Heritage GuildDenmark1885SilkTradeIsabel HeritageActive
1515OldTown CompanyBelgium1738TeaTradeSuresh OldTownActive

5.PROC FREQ - 

purpose: frequency distribution for categorical variables 

proc freq data=old_businesses;

  tables Country Industry Status / nocum nopercent;

run; 

Output:

The FREQ Procedure

CountryFrequency
Australia3
Austria3
Belgium3
Brazil3
Canada3
China4
Denmark3
Egypt3
France4
Germany4
Hong Kong3
India4
Indonesia3
Italy4
Japan3
Malaysia3
Mexico3
Morocco3
Netherlands3
Norway3
Portugal3
Russia3
Saudi Arabia3
Singapore3
South Africa3
Spain4
Sweden3
Switzerland3
Thailand3
Turkey3
UAE3
United Kingdom4
United States4
Vietnam3
IndustryFrequency
Agriculture5
Banking5
Brewing5
Carpentry5
Ceramics5
CoffeeTrade5
Glassmaking5
Insurance5
Jewelry4
LeatherTanning4
Metalworking5
Mining4
Porcelain4
Printing5
Publishing5
Retailing5
Shipbuilding5
Shipping5
SilkTrade5
SpiceTrading5
TeaTrade5
Textiles5
Watchmaking4
StatusFrequency
Active68
Defunct42

6.PROC MEANS - 

purpose: basic numeric summary of Founded_Year 

proc means data=old_businesses min mean median max std n;

  var Founded_Year;

run; 

Output:

The MEANS Procedure

Analysis Variable : Founded_Year
MinimumMeanMedianMaximumStd DevN
1601.001809.611824.002012.00105.9860728110

7.PROC SORT + PROC REPORT - 

purpose: sorted grouped report showing counts by country and earliest founded year 

proc sort data=old_businesses out=obs_sorted; by Country Founded_Year; 

run;

proc print data=obs_sorted(obs=20);

run;

Output:

ObsBusiness_IDNameCountryFounded_YearIndustryFounderStatus
190Greenfield FoundryAustralia1668JewelryWilliam BurlingActive
222Sterling TradingAustralia1840WatchmakingEmily SterlingDefunct
356Royal GuildAustralia1887PublishingAna UnionDefunct
417Blackwell CompanyAustria1724InsurancePaul BlackwellDefunct
585Continental TradingAustria1772CoffeeTradeMarcus LancasterActive
651Cortes &Austria1887MetalworkingAlessandro ImperialActive
715OldTown CompanyBelgium1738TeaTradeSuresh OldTownActive
849Abernathy WorksBelgium1757BrewingLi SilkwoodDefunct
983Pioneer HouseBelgium1803SilkTradePaul BlackwellDefunct
1088Windsor &Brazil1635MiningEmily SterlingActive
1154Tudor PartnersBrazil1862SpiceTradingJohn AtlanticDefunct
1220Brighton HouseBrazil1952LeatherTanningSophia BrightonDefunct
1391Lancaster WorksCanada1674WatchmakingJose CortesActive
1423Mayfair CoCanada1680PorcelainThomas AbbottDefunct
1557Eastern CompanyCanada1864GlassmakingKarl ContinentalActive
168Royal EmporiumChina1632SpiceTradingMaria RoyalActive
17110Heritage Co.China1739RetailingEmily SterlingActive
1842Greenfield WarehousesChina1881MiningSophia BrightonActive
1976Hanover ConsortiumChina1936AgricultureJohn AtlanticActive
2014Heritage GuildDenmark1885SilkTradeIsabel HeritageActive


proc report data=obs_sorted nowd;

  column Country n Founded_Year=MinYear Founded_Year=MaxYear;

  define Country / group;

  define n / 'Count';

  define MinYear / min 'Earliest Founded Year';

  define MaxYear / max 'Latest Founded Year';

run;

Output:

CountryCountEarliest Founded YearLatest Founded Year
Australia316681887
Austria317241887
Belgium317381803
Brazil316351952
Canada316741864
China416321936
Denmark318851953
Egypt316561983
France417341951
Germany417491962
Hong Kong316011886
India417871895
Indonesia316591937
Italy417681927
Japan316421920
Malaysia317031898
Mexico316721844
Morocco317011837
Netherlands317001930
Norway317261895
Portugal317221863
Russia316631989
Saudi Arabia316391699
Singapore318052012
South Africa318411889
Spain416211707
Sweden317671849
Switzerland318111850
Thailand319001901
Turkey316941798
UAE317141940
United Kingdom417451979
United States416601956
Vietnam316921983


8.PROC SQL - 

purpose: create summary tables and an example join / aggregation 

proc sql;

  create table industry_summary as

  select Industry,

         count(*) as N_Businesses,

         min(Founded_Year) as Earliest format=4.,

         max(Founded_Year) as Latest format=4.

  from old_businesses

  group by Industry

  order by N_Businesses desc;

quit; 

proc print data=industry_summary(obs=20);run;

Output:

ObsIndustryN_BusinessesEarliestLatest
1CoffeeTrade516211876
2SpiceTrading516321920
3TeaTrade516991863
4Printing516921898
5Shipping517701953
6Ceramics516351938
7Textiles517401983
8Metalworking516992012
9Brewing516391950
10Glassmaking517451900
11Publishing517221983
12Shipbuilding516561951
13Carpentry517261896
14SilkTrade517011962
15Agriculture516761936
16Banking516501940
17Insurance516941805
18Retailing516011850
19Porcelain416801979
20Mining416351989

9.Macro to print top N industries with a custom header 

%macro show_top_industries(n=5);

  %put NOTE: Showing top &n industries by business count;

  proc print data=industry_summary(obs=&n) noobs; title "Top &n Industries"; run;

%mend show_top_industries;


%show_top_industries(n=8) 

Output:

Top 8 Industries

IndustryN_BusinessesEarliestLatest
CoffeeTrade516211876
SpiceTrading516321920
TeaTrade516991863
Printing516921898
Shipping517701953
Ceramics516351938
Textiles517401983
Metalworking516992012

10.Macro to create a country-specific dataset and basic stats 

%macro country_report(country=United States);

  %let cname = %sysfunc(tranwrd(&country, %str( ), _));

  proc sql;

    create table &cname._biz as

    select * from old_businesses

    where Country = "%sysfunc(tranwrd(&country,_ , ))";

  quit;

  proc means data=&cname._biz min mean median max n;

    var Founded_Year;

    title "Summary for &country";

  run;

%mend country_report;


%country_report(country=United States) 

Output:

Summary for United States

The MEANS Procedure

Analysis Variable : Founded_Year
MinimumMeanMedianMaximumN
1660.001810.001812.001956.004

11.Validation / QC: check for duplicates on Name + Founded_Year 

proc sort data=old_businesses nodupkey dupout=dups; 

 by Name Founded_Year;

run; 

proc print data=dups; 

 title 'Duplicate Name-Founded_Year combinations (if any)';

run; 

Log:

NOTE: 0 observations with duplicate key values were deleted.

NOTE: No observations in data set WORK.DUPS.




To Visit My Previous Online Streaming Flatform Dataset:Click Here
To Visit My Previous Statistical Evaluation Of Clinical Trials:Click Here
To Visit My Previous Unlocking Retail Insights Dataset:Click Here
To Visit My Previous Sas Interview Questions-1:Click Here



Follow Us On : 


 


--- FOLLOW OUR BLOG FOR MORE INFORMATION.

--->PLEASE DO COMMENTS AND SHARE OUR BLOG.


Comments

Popular posts from this blog

409.Can We Build a Reliable Emergency Services Analytics & Fraud Detection System in SAS While Identifying and Fixing Intentional Errors?

397.If a satellite has excellent signal strength but very high latency, can it still deliver good quality communication? Why or why not?A Sas Study

401.How Efficient Are Global Data Centers? A Complete SAS Analytics Study