Data Access Policy

This site provides access to data from my published research. Each data download includes a readme file with further information, including the reference paper that users should cite as the data source.

Data by topic:

Occupation Codes
Occupational Tasks
Industry Codes
Industry Trade Exposure
Local Labor Market Geography
County Industry Structure
Political Geography
Political Outcomes
Patents
File Archives by Publication



[A] Occupation Codes

The occ1990dd occupation classification aggregates U.S. Census occupation codes to a balanced panel of occupations for the 1980, 1990, and 2000 Census, as well as the 2005-2008 ACS. The files below also allow users to build an unbalanced panel of occ1990dd codes for the Census years 1950, 1960 and 1970.

Crosswalk Files

  • [A1] 1950 Census occ to occ1990dd.
  • [A2] 1960 Census occ to occ1990dd.
  • [A3] 1970 Census occ to occ1990dd.
  • [A4] 1980 Census occ to occ1990dd.
  • [A5] 1990 Census occ to occ1990dd.
  • [A6] 2000 Census occ to occ1990dd.
  • [A7] 2005 ACS occ to occ1990dd.
  • [A8] 2010 Census occ to occ1990dd.
  • [A9] Aggregation of occ1990dd to occupation groups.

Additional Resources

  • [A10] Construction of occ1990dd occupation codes.

References

  • For [A1] to [A8] and [A10]: David Autor and David Dorn. "The Growth of Low Skill Service Jobs and the Polarization of the U.S. Labor Market." American Economic Review, 103(5), 1553-1597, 2013.
  • For [A9]: David Autor. "Why Are There Still So Many Jobs? The History and Future of Workplace Automation." Journal of Economic Perspectives, 29(3), 3-30, 2015.
  • For [A10]: David Dorn. "Essays on Inequality, Spatial Interaction, and the Demand for Skills." Dissertation University of St. Gallen no. 3613, September 2009.


[B] Occupational Tasks

The files below provide task data for occ1990dd occupations. Abstract, routine and manual tasks in file [B1] are based on data from the Dictionary of Occupational Titles 1977 while offshorability in file [B2] is based on task values from O*Net.

Data Files

  • [B1] Abstract, routine and manual task content of occ1990dd occupations.
  • [B2] Offshorability of occ1990dd occupations.

Reference for [B1] and [B2]

  • David Autor and David Dorn. "The Growth of Low Skill Service Jobs and the Polarization of the U.S. Labor Market." American Economic Review, 103(5), 1553-1597, 2013.


[C] Industry Codes

File [C1] provides a weighted crosswalk from NAICS 1997 6-digit to SIC 1987 4-digit industry codes. Data on employment by NAICS-SIC cells is provided by the U.S. Census Bureau, though some employment counts are only reported in brackets. The construction of the crosswalk file imputes employment within brackets based on information about establishment counts by NAICS-SIC cells, and data on the average number of employees per establishment by industry from the County Business Pattern data (see [F] below). File [C2] aggregates SIC 1987 4-digit codes for manufacturing industries such that each resulting industry maps to one or several HS 6-digit product codes. File [C3] further aggregates these industries to 10 broad manufacturing sectors. File [C4] aggregates the consistent Census industry code ind1990 to a balanced panel of industries for the 1980, 1990 and 2000 Census and the 2006-2008 ACS. Files [C5] to [C7] construct a more detailed balanced industry panel for the 1980, 1990 and 2000 Census, which draws on each year's original industry codes and includes a weighted crosswalk for the 2000 Census. Each SIC 1987 4-digit industry can be mapped to one ind1990ddx code using file [C8].
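
As an illustration of what a weighted crosswalk contains, the sketch below derives one weight per NAICS-SIC cell as that cell's share of the NAICS industry's total employment. This is only a plausible reading of the description above, not the documented construction of file [C1]: the function name, the dictionary layout, and the industry codes are hypothetical, and the employment counts are made up. The sketch also assumes that any bracketed counts have already been imputed.

```python
from collections import defaultdict


def employment_share_weights(cell_employment):
    """Turn employment by (naics, sic) cell into shares within each NAICS code.

    cell_employment: dict mapping (naics, sic) -> employment count.
    Returns a dict mapping (naics, sic) -> weight, where the weights of
    each NAICS code sum to one. All names here are hypothetical.
    """
    naics_totals = defaultdict(float)
    for (naics, _sic), employment in cell_employment.items():
        naics_totals[naics] += employment

    # Cells of a NAICS code with zero total employment get no weight.
    return {
        (naics, sic): employment / naics_totals[naics]
        for (naics, sic), employment in cell_employment.items()
        if naics_totals[naics] > 0
    }


# Made-up codes and counts, for illustration only.
cells = {
    ("111111", "0001"): 300.0,
    ("111111", "0002"): 100.0,
    ("222222", "0002"): 50.0,
}
weights = employment_share_weights(cells)
```

In this toy example the first NAICS code is split 0.75/0.25 across two SIC codes, while the second maps entirely to one SIC code. The readme file in the download documents the actual variable names and weighting.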

Crosswalk Files

  • [C1] NAICS97 6-digit to SIC87 4-digit.
  • [C2] SIC87 4-digit to SIC87dd 4-digit.
  • [C3] Aggregation of SIC87dd manufacturing industry codes to 10 manufacturing subsectors.
  • [C4] Census ind1990 to ind1990dd.
  • [C5] 1980 Census ind to ind1990ddx.
  • [C6] 1990 Census ind to ind1990ddx.
  • [C7] 2000 Census ind to ind1990ddx.
  • [C8] SIC87 to ind1990ddx.

Additional Resources

  • [C9] List of SIC87 and SIC87dd and corresponding ind1990dd manufacturing industry codes.
  • [C10] Comparison of 397 SIC87dd industry panel of Autor, Dorn and Hanson (2013) with compressed 392 SIC87dd industry panel of Acemoglu, Autor, Dorn, Hanson and Price (2016).

References

  • For [C1], [C2], [C4], and [C9]: David Autor, David Dorn and Gordon Hanson. "The China Syndrome: Local Labor Market Effects of Import Competition in the United States." American Economic Review, 103(6), 2121-2168, 2013.
  • For [C3]: David Autor, David Dorn, Gordon Hanson and Jae Song. "Trade Adjustment: Worker Level Evidence." Quarterly Journal of Economics, 129(4), 1799-1860, 2014.
  • For [C5] to [C8]: David Autor, David Dorn and Gordon Hanson. "When Work Disappears: Manufacturing Decline and the Falling Marriage-Market Value of Young Men." American Economic Review: Insights, 1(2): 161-178, 2019.
  • For [C10]: Daron Acemoglu, David Autor, David Dorn, Gordon Hanson and Brendan Price. "Import Competition and the Great U.S. Employment Sag of the 2000s." Journal of Labor Economics, 34(S1): S141-S198, 2016.


[D] Industry Trade Exposure

We concord 6-digit HS product-level trade data from U.N. Comtrade to 4-digit sic87dd manufacturing industries using the crosswalk file [D4]. File [D1] reports annual nominal imports of manufactured goods by the United States and a group of eight other wealthy countries (Germany, Switzerland, Spain, Denmark, Finland, Japan, Australia, New Zealand) for which comparable trade data is available since 1991. The exporters are China, other low-wage countries, Mexico and CAFTA, USA, Canada, and rest of the world. File [D2] provides real industry-level imports of Chinese manufactures in the U.S. and in the eight other wealthy countries, as well as values of U.S. domestic absorption. File [D3] indicates industries' exposure to Chinese import penetration at the level of their supplier or customer industries.

Data Files

  • [D1] Trade flows by sic87dd industry, importer, exporter, and year 1991-2014.
  • [D2] Imports from China by sic87dd industry in US and other wealthy countries, 1991-2014.
  • [D3] Change in upstream and downstream import exposure by sic87dd industry, 1991-2011 and subperiods.

Crosswalk File

  • [D4] HS 6-digit to sic87dd 4-digit.

References

  • For [D1] and [D2]: David Autor, David Dorn and Gordon Hanson. "When Work Disappears: Manufacturing Decline and the Falling Marriage-Market Value of Young Men." American Economic Review: Insights, 1(2): 161-178, 2019.
  • For [D3]: Daron Acemoglu, David Autor, David Dorn, Gordon Hanson and Brendan Price. "Import Competition and the Great U.S. Employment Sag of the 2000s." Journal of Labor Economics, 34(S1): S141-S198, 2016.
  • For [D4]: David Autor, David Dorn and Gordon Hanson. "The China Syndrome: Local Labor Market Effects of Import Competition in the United States." American Economic Review, 103(6), 2121-2168, 2013.


[E] Local Labor Market Geography

Commuting Zones (CZs) provide a local labor market geography that covers the entire land area of the United States. CZs are clusters of U.S. counties that are characterized by strong within-cluster and weak between-cluster commuting ties. The crosswalk files [E1] to [E6] below provide a probabilistic matching of sub-state geographic units in U.S. Census Public Use Files to CZs. The variable afactor indicates which fraction of a SEA/County Group/PUMA maps to a given CZ. The file [E7] provides a mapping from counties to CZs based on county definitions in 1990. File [E10] provides information on changes of county codes, and indicates how counties that split, merged or were renamed can be assigned to 1990-based CZs. The files [E8] and [E9] map CZs to the states and Census divisions that comprise the largest share of a CZ's population.
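
A minimal sketch of how the probabilistic matching can be applied: each observation's weight is multiplied by afactor and allocated to every CZ that its geographic unit overlaps. Only the variable name afactor comes from the description above. The function name, the data layout, the geographic identifiers, and the numbers are hypothetical, so check the readme files for the actual variable names.

```python
from collections import defaultdict


def allocate_to_czs(records, crosswalk):
    """Allocate weighted observations to commuting zones.

    records: iterable of (geo_unit, person_weight) pairs.
    crosswalk: dict mapping geo_unit -> list of (czone, afactor) pairs,
        where the afactor values of one geo_unit sum to one.
    Returns a dict mapping czone -> allocated weight total.
    """
    totals = defaultdict(float)
    for geo_unit, person_weight in records:
        for czone, afactor in crosswalk[geo_unit]:
            # An observation contributes to each overlapping CZ
            # in proportion to afactor.
            totals[czone] += person_weight * afactor
    return dict(totals)


# Made-up example: puma_b is split across two CZs.
crosswalk = {
    "puma_a": [("cz_1", 1.0)],
    "puma_b": [("cz_1", 0.25), ("cz_2", 0.75)],
}
records = [("puma_a", 100.0), ("puma_b", 200.0), ("puma_b", 40.0)]
cz_totals = allocate_to_czs(records, crosswalk)
```

Because the afactor values of each geographic unit sum to one, the allocation preserves the total weight: the 340 units of weight in this example are all assigned to some CZ.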

Crosswalk Files

  • [E1] 1950 Census State Economic Areas to 1990 Commuting Zones.
  • [E2] 1970 Census County Groups to 1990 Commuting Zones.
  • [E3] 1980 Census County Groups to 1990 Commuting Zones.
  • [E4] 1990 Census Public Use Micro Areas to 1990 Commuting Zones.
  • [E5] 2000 Census and 2005-2011 ACS Public Use Micro Areas to 1990 Commuting Zones.
  • [E6] 2010 Census and 2012-ongoing ACS Public Use Micro Areas to 1990 Commuting Zones.
  • [E7] 1990 Counties to 1990 Commuting Zones.
  • [E8] 1990 Commuting Zones to States.
  • [E9] 1990 Commuting Zones to Census Divisions.

Additional Resources

  • [E10] Changes of county codes.
  • [E11] Construction of geography crosswalks.

References for [E1] to [E11]

  • For [E1] to [E5] and [E7] to [E10]: David Autor and David Dorn. "The Growth of Low Skill Service Jobs and the Polarization of the U.S. Labor Market." American Economic Review, 103(5), 1553-1597, 2013.
  • For [E6]: David Autor, David Dorn and Gordon Hanson. "When Work Disappears: Manufacturing Decline and the Falling Marriage-Market Value of Young Men." American Economic Review: Insights, 1(2): 161-178, 2019.
  • For [E11]: David Dorn. "Essays on Inequality, Spatial Interaction, and the Demand for Skills." Dissertation University of St. Gallen no. 3613, September 2009.


[F] County Industry Structure

The publicly available County Business Pattern data provides employment counts by county and industry. The employment numbers are often reported only in brackets. File [F1] provides imputed employment counts by county and 4-digit sic87dd industry for selected years, derived using the fixed point imputation algorithm described in Autor, Dorn and Hanson (AER 2013).

Data Files

  • [F1] CBP county-level employment by industry, 1988, 1991, 1999, 2007 and 2011.

Reference for [F1]

  • Daron Acemoglu, David Autor, David Dorn, Gordon Hanson and Brendan Price. "Import Competition and the Great U.S. Employment Sag of the 2000s." Journal of Labor Economics, 34(S1): S141-S198, 2016.


[G] Political Geography

Files [G1] to [G4] match districts of the 109th, 110th, 113th and 115th U.S. Congress (elected in 2004, 2006, 2012 and 2016) to districts of the 108th Congress (elected in 2002). Files [G5] to [G8] provide more detailed crosswalks between county-district cells. The crosswalks for the 109th, 110th, and 115th Congress comprise only the subset of states where district boundaries changed outside the usual decennial schedule.

Crosswalk Files

  • [G1] 109th Congress districts to 108th Congress districts.
  • [G2] 110th Congress districts to 108th Congress districts.
  • [G3] 113th Congress districts to 108th Congress districts.
  • [G4] 115th Congress districts to 108th Congress districts.
  • [G5] 109th Congress county-district cells to 108th Congress county-district cells.
  • [G6] 110th Congress county-district cells to 108th Congress county-district cells.
  • [G7] 113th Congress county-district cells to 108th Congress county-district cells.
  • [G8] 115th Congress county-district cells to 108th Congress county-district cells.

Reference for [G1] to [G8]

  • David Autor, David Dorn, Gordon Hanson and Kaveh Majlesi. "Importing Political Polarization? The Electoral Consequences of Rising Trade Exposure." American Economic Review, 110(10): 3139-3189, 2020.


[H] Political Outcomes

File [H1] classifies members of Congress into liberal Democrats, moderate Democrats, moderate Republicans and conservative Republicans based on the average ideology scores of their campaign donors during an electoral cycle. Donor ideology is based on the DIME database. File [H2] indicates whether a legislator was ever affiliated with a caucus that is connected to the Tea Party movement. File [H3] lists all legislators who were elected to Congress in general elections from 2002 to 2016, including their names, parties, ICPSR IDs, congressional district numbers, and states (state abbreviations, FIPS codes, and ICPSR codes). It is useful for linking other files that provide information on politicians and congressional districts.

Data Files

  • [H1] Liberal, moderate and conservative legislators in U.S. Congress, 2002-2016.
  • [H2] Members of the Tea Party, Liberty and Freedom Caucuses, 2010-2016.
  • [H3] Winners of general elections for U.S. Congress, 2002-2016.

Reference for [H1] to [H3]

  • David Autor, David Dorn, Gordon Hanson and Kaveh Majlesi. "Importing Political Polarization? The Electoral Consequences of Rising Trade Exposure." American Economic Review, 110(10): 3139-3189, 2020.


[I] Patents

File [I1] matches corporate patents granted by the USPTO between 1975 and March 2013 to Compustat firm identification numbers (GVKEY). The matching is based on the firm names that are indicated on patent records, which are sometimes abbreviated or misspelled. The algorithm documented in file package [I2] leverages a web search engine to match the many company name variations found on patents to the corresponding firm records.

Crosswalk File

  • [I1] 1975-2013 USPTO patents to Compustat GVKEY.

Additional Resources

  • [I2] Documentation of the internet-based name matching algorithm.

Reference for [I1] and [I2]

  • David Autor, David Dorn, Gordon Hanson, Gary Pisano and Pian Shu. "Foreign Competition and Domestic Innovation: Evidence from U.S. Patents." American Economic Review: Insights, 2(3): 357-374, 2020.


[P] File Archives by Publication

The archives below provide project-specific packages of Stata data files, do files, log files and graphs, as well as tables and figures in Excel format. These files have also been published on the websites of the corresponding journals in accordance with the journals' data policies. Please check the list of data by topic above for updated and supplementary data files.

File Packages

  • [P1] David Autor and David Dorn. "This Job Is 'Getting Old': Measuring Changes in Job Opportunities Using Occupational Age Structure."
    American Economic Review, P&P, 99(2), 45-51, 2009.
  • [P2] David Autor and David Dorn. "The Growth of Low Skill Service Jobs and the Polarization of the U.S. Labor Market."
    American Economic Review, 103(5), 1553-1597, 2013.
  • [P3] David Autor, David Dorn and Gordon Hanson. "The China Syndrome: Local Labor Market Effects of Import Competition in the United States."
    American Economic Review, 103(6), 2121-2168, 2013.
  • [P4] Daron Acemoglu, David Autor, David Dorn, Gordon Hanson and Brendan Price. "Return of the Solow Paradox? IT, Productivity and Employment in U.S. Manufacturing."
    American Economic Review, P&P, 104(5), 394-399, 2014.
  • [P5] David Autor, David Dorn, and Gordon Hanson. "Untangling Trade and Technology: Evidence from Local Labor Markets."
    Economic Journal, 125(584), 621-646, 2015.
  • [P6] Daron Acemoglu, David Autor, David Dorn, Gordon Hanson and Brendan Price. "Import Competition and the Great U.S. Employment Sag of the 2000s."
    Journal of Labor Economics, 34(S1): S141-S198, 2016.
  • [P7] David Autor, David Dorn and Gordon Hanson. "When Work Disappears: Manufacturing Decline and the Falling Marriage-Market Value of Young Men."
    American Economic Review: Insights, 1(2): 161-178, 2019.
  • [P8] David Autor, David Dorn, Lawrence Katz, Christina Patterson and John Van Reenen. "The Fall of the Labor Share and the Rise of Superstar Firms."
    Quarterly Journal of Economics, 135(2): 645-709, 2020.
  • [P9] David Autor, David Dorn, Gordon Hanson and Kaveh Majlesi. "Importing Political Polarization? The Electoral Consequences of Rising Trade Exposure."
    American Economic Review, 110(10): 3139-3189, 2020.
  • [P10] David Autor, David Dorn, Gordon Hanson, Gary Pisano and Pian Shu. "Foreign Competition and Domestic Innovation: Evidence from U.S. Patents."
    American Economic Review: Insights, 2(3): 357-374, 2020.
  • [P11] David Dorn and Josef Zweimüller. "Migration and Labor Market Integration in Europe."
    Journal of Economic Perspectives, 35(2): 49-76, 2021.