HEALTH ECONOMICS RESOURCE CENTER

I. Finding and Using Healthcare Data

14. How do I find non-VA health care data sets?

Many data sets that are potentially useful for health care research are available to researchers. This list includes some of the more important ones, but it is not comprehensive. The Inter-university Consortium for Political and Social Research (ICPSR), based at the University of Michigan, is a source for many data sets. Researchers should check this source before purchasing data, as the data may be available for free from the ICPSR. Related information appears in other FAQs in this section (I. Finding and Using Healthcare Data) of the HERC web site.

Medicare Patient Level Data

Medicare is a source for many different types of data. The main limitation of Medicare data is that they include only Medicare patients, who are predominantly aged 65 and over. Data are available for inpatient and outpatient care, and for providers (physicians) and facilities. A complete list of the data available for purchase is on the CMS web site. Because of the complexity of the Medicare data, CMS supports a university-based service that provides free assistance to researchers.

A linkage of VA and Medicare data is available to VA researchers through the Veterans Affairs Information Resource Center (VIREC).

Other Sources of Patient Level Data

Most states have hospital discharge data that they make available to researchers. The Healthcare Cost and Utilization Project (HCUP) at the Agency for Healthcare Research and Quality (AHRQ) has assembled many of these. From these, AHRQ has created the Nationwide Inpatient Sample (NIS), which has a 20% sample of inpatient discharges. AHRQ also acts as a clearinghouse for researchers who wish to purchase state discharge data in a common format. The state data are also available directly from those states that release them to researchers. Many states have data elements that are not available on the NIS, and many are now linking their discharge data to other data, such as death files. Researchers need to contact the individual states, as most states put additional restrictions on access to linked data sets. States are continually expanding the scope of their health data collection. For example, California now requires hospitals to report all outpatient surgeries and emergency department visits.

A major limitation of the state discharge data is that they do not include physician costs or any outpatient costs. There are sources of these data, but they are not population-based. In addition to Medicare, information on physician charges and ambulatory care can be obtained from state Medicaid data and from private sector vendors that have compiled claims data. The private sector vendors include Ingenix and MedStat.

Provider Payment Data

Medicare provides downloadable data sets with information about all of its payments. A comprehensive list of the Medicare public use data downloads is available, as is information on the Medicare Hospital Outpatient Prospective Payment System (APCs).

Medicare does not pay for all health care services. Ingenix has compiled a more comprehensive list of provider payment relative value units that are based on the Medicare provider payment methodology.

Other Sources of Data

The Center for Studying Health System Change's Community Tracking Study (CTS) is a unique resource. The CTS is compiling extensive longitudinal data about 12 metropolitan areas selected to be representative of the entire country.

The American Medical Association (AMA) and the American Hospital Association (AHA) are good sources of data about physicians and hospitals. These data are based on surveys, but the very high response rates result in data that are more like population data. The one disadvantage of these data is that they are fairly expensive.

The links that follow contain additional information on the AMA Annual Survey of Physicians and the AHA Annual Survey of Hospitals.

The Area Resource File (ARF) has extensive county-level data that are compiled from a wide range of sources. The ARF is a longitudinal file; the current version of the file contains all previous years' data. These time series vary in length. Other government offices, such as the Census Bureau and the Bureau of Labor Statistics, are also useful sources of data.

Many types of population-based data are available from the National Center for Health Statistics. These include vital statistics data such as the Multiple Cause of Death and Mortality Detail. A complete list of the data available is on the NCHS web page.

When looking for data, another useful source is the Department of Health and Human Services Directory of Federal data bases.

Survey Data

Many surveys are conducted for various purposes. Some are population-based, while others focus on specific groups. This section lists, without annotation, the name of each survey and a web link for additional information.

Reviewed/Updated Date: November 21, 2007