HERC: HERC's MCA Discharge Dataset with Subtotals for Inpatient Categories of Care

Suggested Citation

Wagner T, Lo J. HERC's MCA Discharge Dataset with Subtotals for Inpatient Categories of Care. Guidebook. Menlo Park, CA. VA Palo Alto, Health Economics Resource Center; September 2017.



All tables for the Managerial Cost Accounting (MCA) Discharge Dataset with Subtotals for Inpatient Categories of Care guidebook are saved in an Excel file. Download the tables here.

Many URLs are not live because they are VA intranet-only. Researchers with VA intranet access can access these sites by copying and pasting the URLs into their browser.

For a list of VA acronyms, please visit the VA acronym checker on the Internet at https://www.va.gov/ORO/Acronyms.asp or VA intranet at http://vaww.va.gov/Acronyms/fulllist.cfm.

1. Overview

The Managerial Cost Accounting System (MCA; formerly Decision Support System (DSS)) maintains National Data Extracts (NDE) that track cost and utilization for care provided by the U.S. Department of Veterans Affairs (VA) medical centers. Researchers can access these files, which are stored as structured query language (SQL) datasets at the VA Corporate Data Warehouse (CDW).

The MCA Discharge (DISCH) NDE includes information on the entire span of an inpatient hospitalization. It provides the discharge bed section (DBEDSECT), but does not have detailed information on other treating specialties during an inpatient stay. A bed section, also known as the treating specialty, is a two-digit code developed by VA to characterize the type of hospital care patients receive.

If a researcher is interested in information on specific treating specialty segments of an inpatient stay, then the researcher must extract the information from the MCA Treating Specialty (TRT) NDE. To expedite this process, we created a new dataset beginning fiscal year (FY) 2007 that is identical to the DISCH with the exception of additional fields containing cost and length of stay subtotals for each inpatient category of care (e.g., acute medicine, psychiatry, nursing home, etc.). The categories of care represent our groupings of common bed sections as reported in the TRT.

This guidebook describes how we prepared the HERC Discharge dataset with cost and length of stay (LOS) subtotals for each inpatient category of care. Chapter 2 describes the methods we used to merge the DISCH and TRT. In our comparison of the two files, we found that the vast majority of the records had equivalent total costs and lengths of stay. There were few cases where this was not true. Chapter 3 describes our methods for dealing with the small number of records with cost and length of stay differences between the DISCH and TRT, while Chapter 4 discusses important information regarding the use of the new Discharge dataset.

For more information on these MCA NDEs or other available NDEs, see the current MCAO National Data Extract (NDE) Technical Guide (http://vaww.dss.med.va.gov/nationalrptg/nr_extracts.asp) or the HERC Research Guide to Decision Support System National Cost Extracts (https://www.herc.research.va.gov/include/page.asp?id=guidebooks#DSS).

1.1. Updates

FY2016 update

In FY16 there are some discrepancies in the raw length of stay variable and in the national daily average cost for unidentified care (category 999). These discrepancies are detailed in section 2.2.4.

FY2015 update

In FY15 we noticed some discrepancies in the raw length of stay variable. These discrepancies are detailed in section 2.2.3.

FY2013 update

The FY 2013 update of this guidebook shows all variables as SAS/SQL. Variable names will be listed as “SAS name/SQL name.” For the crosswalk of SAS to SQL variables, see the MCAO NDE Layout Specifications Excel sheet on the MCAO website at http://vaww.dss.med.va.gov/nationalrptg/nr_extracts.asp. Note: The most recent year of MCA SAS files is FY 2012. Data for FY 2013 and forward are in SQL format only.

FY2012 update

The FY 2012 update of the guidebook includes a new section (3.2.3) on changes to variables in the FY 2012 DISCH and TRT.

2. Merging the Discharge (DISCH) and Treating Specialty (TRT) NDEs

The DISCH file includes expenses associated with a completed inpatient encounter and is reported in the fiscal year of discharge. Each record represents a unique hospital stay. Consequently, the DISCH may include information from multiple fiscal years. The TRT, on the other hand, corresponds to a patient’s duration in a treating specialty segment. Each segment is defined either by a change in treating specialty, or for each complete month if no change in treating specialty took place. For example, a veteran who is hospitalized in neurology from April through May will have two records. Therefore, multiple TRT records can exist for a given admission because patients move to different treating specialties during their hospitalization, or they stay past the end of the month in the same treating specialty. TRT has information on the hospital care provided in the current year, regardless of whether the patient was discharged. Information on stays that span multiple fiscal years is in multiple TRT files.

2.1. Categories of Care

Thirteen categories of care were created based on the treating specialty variable, TRTSP/TXSP (SAS/SQL), in the TRT dataset. Each category represents common groupings of bed sections, but is not completely consistent with the HERC Inpatient Average Cost categories. The HERC Discharge has a separate category for intensive care unit (ICU) and unidentified treating specialties (see Table 1).

2.1.1. Change to Treating Specialty Variable Type

Beginning FY 2007, the treating specialty variable, TRTSP/TXSP, was converted from numeric to character variable type. This change was made to accommodate new alphanumeric treating specialty codes (refer to Table 1). TRTSP_C, the former character variable type for treating specialties, has been dropped because it is no longer needed.

2.1.2. Assumption Regarding Missing Treating Specialties

Typically, a small share of TRT records (approximately 0.3% of all TRT records from the same fiscal year) have missing treating specialties but represent valid costs and utilization. In the past, we were unable to ascertain the treating specialty by linking the TRT with the Patient Treatment File (PTF) Bed Section file. However, we still required a technique that could easily be replicated for future versions of the HERC Discharge dataset. Therefore, we resolved this issue by assigning missing treating specialties to Acute Medicine, since over 50% of inpatient care provided by VA is in this category. Because so few observations are affected, this assignment has little impact on the category subtotals.

2.2. Adding Category Costs and Length of Stay to the DISCH

Costs and lengths of stay were calculated for each category and summed across unique combinations of scrambled social security number (SCRSSN/SCRSSN), station (STA3N/STA3N), admission date (ADMITDAY/ADMITDAY), and encounter number (ENC_NUM/ENCNO). These variables were used to identify a unique hospital stay. When we merged the DISCH file with the summarized TRT dataset by these key variables, nearly all the records matched perfectly. The remaining few unmatched DISCH records were found to have zero costs and missing encounter numbers even though the lengths of stay were greater than zero. Because no ENC_NUM/ENCNO value was available for these DISCH records that could not be matched to a unique hospitalization in TRT, we used a SAS DATA step to perform another merge on these few records. The BY variables we used were SCRSSN/SCRSSN, STA3N/STA3N, and ADMITDAY/ADMITDAY (excluding the ENC_NUM/ENCNO variable that we used in the first merge at the beginning of this section). After this stage, no DISCH records remained unmatched to a summarized TRT record which included subtotals for categories of care.
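The two-pass merge described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the SAS program we ran: the lowercase field names, the `category` field, and the function names are hypothetical, and the sketch assumes that the fallback pass sums every TRT record sharing the same patient, station, and admission date.

```python
from collections import defaultdict

# Key variables for the first pass and for the fallback pass (no encounter number).
FULL_KEYS = ("scrssn", "sta3n", "admitday", "enc_num")
FALLBACK_KEYS = ("scrssn", "sta3n", "admitday")


def summarize_trt(trt_rows, keys):
    """Sum cost and length of stay per category of care within each key combination."""
    totals = defaultdict(lambda: defaultdict(lambda: {"cost": 0.0, "los": 0}))
    for row in trt_rows:
        key = tuple(row[k] for k in keys)
        cell = totals[key][row["category"]]
        cell["cost"] += row["cost"]
        cell["los"] += row["los"]
    return totals


def merge_disch_trt(disch_rows, trt_rows):
    """Attach category subtotals to each DISCH record; return (merged, unmatched)."""
    by_full = summarize_trt(trt_rows, FULL_KEYS)
    by_fallback = summarize_trt(trt_rows, FALLBACK_KEYS)
    merged, unmatched = [], []
    for disch in disch_rows:
        full_key = tuple(disch[k] for k in FULL_KEYS)
        fallback_key = tuple(disch[k] for k in FALLBACK_KEYS)
        if full_key in by_full:
            # First pass: all four key variables match.
            subtotals = by_full[full_key]
        elif disch["enc_num"] is None and fallback_key in by_fallback:
            # Second pass: encounter number missing, so match on the other three keys.
            subtotals = by_fallback[fallback_key]
        else:
            unmatched.append(disch)
            continue
        merged.append({**disch, "subtotals": {c: dict(v) for c, v in subtotals.items()}})
    return merged, unmatched
```

Returning the unmatched records separately, rather than dropping them, mirrors the check that no DISCH record should remain unmatched after the second pass.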

From FY 2009 to FY 2012 we did not encounter any issues with merging the DISCH and summarized TRT files using SCRSSN/SCRSSN, STA3N/STA3N, ADMITDAY/ADMITDAY, and ENC_NUM/ENCNO. All DISCH records merged after the first pass.

Since FY 2013, some records in the DISCH have not merged. We added those records back into the dataset and created a flag variable (FLAG_EXTRA) to assist future 1:1 merges.

2.2.1. Changes to Inpatient Variables in FY2008

In FY 2008, the total DISCH cost variable, DCST_TOT/TOTCOST, as well as the total TRT cost and length of stay variables, TCST_TOT/TOTCOST and TRT_LOS/LOS, respectively, were dropped. DSO has already corrected this in the FY 2009 core NDEs. To calculate the total costs in FY 2008, we summed all the fixed direct, fixed indirect, and variable direct subtotals. We derived the length of stay in TRT by using the following algorithm if TRTIN/TXSPSDT contained a non-missing value: max(TRTOUT-TRTIN,1). In other words, we calculated the length of stay by subtracting the day the patient entered the treating specialty (TRTIN/TXSPSDT) from the day the patient exited the treating specialty (TRTOUT/TXSPEDT), and set the result to 1 when the difference was zero.

2.2.2. Changes to Inpatient Variables in FY2009

Since FY 2009, the DISCH and TRT total cost variables have been included in the NDEs. However, the TRT length of stay variable, now called LOS/LOS instead of TRT_LOS, has missing values for all records. We calculated the length of stay using the following formula that we employ several times throughout the construction of the HERC Discharge file: max(TRTOUT-TRTIN,1). However, we found some cases where the TRTIN/TXSPSDT variable, or treating specialty start date, was missing. To reconcile these records, we took either the first day of the TRTOUT/TXSPEDT month or the admission date, ADMITDAY/ADMITDAY, whichever occurred later. There were no cases of missing TRTOUT dates. Below are two examples detailing how we determined the TRTIN/TXSPSDT dates after finding they had missing values.

Example 1

If a patient was admitted to a VA hospital on 7/22/2009 and exited the treating specialty on 7/31/2009, then the treating specialty start date would be coded as 7/22/2009.

Example 2

If a patient was admitted on 3/27/2008 and had an exit date recorded as 4/30/2008, then the new entry date would be 4/1/2008 because the first of the month occurs after the admission date. As described at the beginning of this chapter, each TRT record is represented either by a change in treating specialty, or for each complete month if no change in treating specialty took place. If this patient did not change treating specialties, the dates tell us there should be at least two TRT records for this encounter, one from 3/27/2008 to 3/31/2008 and one from 4/1/2008 to 4/30/2008. It is possible that the patient’s stay extended past 4/30/2008.
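The date rule in the two examples above, together with the max(TRTOUT-TRTIN,1) formula, can be expressed as a short Python sketch. The function names are hypothetical and chosen for illustration; the original processing was done in SAS.

```python
from datetime import date


def impute_trtin(trtout, admitday):
    """Fill a missing treating specialty start date.

    Uses the later of (a) the first day of the TRTOUT month and
    (b) the admission date, as described in the text.
    """
    first_of_month = trtout.replace(day=1)
    return max(first_of_month, admitday)


def segment_los(trtin, trtout):
    """max(TRTOUT - TRTIN, 1): a treating specialty segment lasts at least one day."""
    return max((trtout - trtin).days, 1)


# Example 1: admitted 7/22/2009, exited the treating specialty 7/31/2009.
start_1 = impute_trtin(date(2009, 7, 31), date(2009, 7, 22))

# Example 2: admitted 3/27/2008, exit date recorded as 4/30/2008.
start_2 = impute_trtin(date(2008, 4, 30), date(2008, 3, 27))
```

Here `start_1` is 7/22/2009 and `start_2` is 4/1/2008, matching Examples 1 and 2.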

2.2.3. Changes to Inpatient Variables in FY2015

Using the FY15 MCA data, we discovered 141 records where the raw length of stay (LOS) did not equal the larger of (DISDAY-ADMITDAY) and 1.

We followed up with MCA staff to report this issue. Per MCA staff, the discrepant records were due to the MCA production database, the source for the NDE, which needed correction. Updates to programming have been implemented for the FY17 datasets. Since these changes affect the production databases the local site teams use for monthly processing, they were unable to make updates to prior year datasets. Only data resulting from current year processing were corrected and prior year processing was closed.

We have left the discrepant records as-is in our resulting dataset.

2.2.4. Changes to Inpatient Variables in FY2016

Using the FY16 MCA data, we discovered 146 records where the raw length of stay (LOS) did not equal the larger of (DISDAY-ADMITDAY) and 1. Per our discussions with MCA regarding similar discrepancies in FY15 data, we have left the discrepant records as-is in our FY16 dataset (see section 2.2.3 for more details).

In creating Table 3 (National Daily Average Cost for each Category of Care, by Fiscal Year), we found that Unidentified Care (Category 999) increased drastically from FY15 to FY16. Upon further investigation we found the increase is due to 1,054 records at station 673 with an unidentified treating specialty. We contacted MCA regarding the records. Per MCA staff, there was a problem with the admissions in the VistA system and they were not assigned a treating specialty. This problem was fixed in the site's VistA system in FY17. We left the discrepant records as-is in our FY16 dataset.

2.3. Definitions of Key Variables for DATA Step Merge

The DISCH and TRT contain information about VA hospitalizations. The variables below uniquely identify a stay and were used to link the DISCH and summarized TRT files using a MERGE statement in the SAS programming language.

  • Scrambled social security number (SCRSSN/SCRSSN). A unique patient identifier formatted to resemble a social security number.
  • Station (STA3N/STA3N). A three-digit code representing the VA medical facility where the patient received care.
  • Admission date (ADMITDAY/ADMITDAY). The date of admission into a VA medical facility. This is different from the TRTIN/TXSPSDT variable in the TRT NDE, which represents the date of entry to a treating specialty.
  • Encounter number (ENC_NUM/ENCNO). An encounter number is a character string consisting of admission date (YYMMDD), the letter “I”, and possibly a sequence number depending on the number of admissions on the same day. A second admission on the same day creates a suffix of “1,” a third admission on the same day creates a suffix of “2,” and so on. For example, if a patient was admitted to a VA hospital on 1/20/2007, discharged and moved to a different facility on the same day, there would be two encounter numbers for the two separate admissions, 070120I and 070120I1, respectively. If the patient did not move between facilities and only had one admission, then only 070120I would be listed.
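The encounter number format can be illustrated with a small Python sketch. The function names are hypothetical, and the pattern assumes a sequence suffix of at most one digit, which is consistent with the 7 or 8 character definition of a valid encounter number given in section 2.4.

```python
import re
from datetime import date

# Six-digit admission date (YYMMDD), the letter "I", and an optional one-digit suffix.
ENC_PATTERN = re.compile(r"\d{6}I\d?")


def build_encounter_number(admitday, same_day_sequence=0):
    """Build an encounter number; sequence 0 is the first admission that day (no suffix)."""
    suffix = "" if same_day_sequence == 0 else str(same_day_sequence)
    return admitday.strftime("%y%m%d") + "I" + suffix


def is_valid_encounter_number(value):
    """Return True when the whole string matches the assumed 7 or 8 character format."""
    return ENC_PATTERN.fullmatch(value) is not None


first = build_encounter_number(date(2007, 1, 20))       # first admission on 1/20/2007
second = build_encounter_number(date(2007, 1, 20), 1)   # second admission the same day
```

With these inputs, `first` is 070120I and `second` is 070120I1, as in the example above.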

2.4. Including TRT Files from Prior Years

When a hospital stay crosses multiple fiscal years, it is characterized by a single DISCH record in the year of discharge and many TRT records over two or more fiscal years. Therefore, we aggregated TRT datasets from FY 2000 (earliest available) onward to account for information not found in the TRT file for the year of discharge. However, TRT files from certain fiscal years do not contain the variables needed to merge with the DISCH file. A list of variables with missing information is presented below.

  • Encounter numbers. Valid encounter numbers, represented by the variable ENC_NUM/ENCNO, only exist for FY 2004 and FY 2006 onward. We defined “valid” to mean those with a 7 or 8 character string consisting of a date (YYMMDD), the letter “I”, and possibly a sequence number. We found that all values in the ENC_NUM/ENCNO field were zero in FY 2003.
  • Length of stay. We calculated treating specialty segment lengths of stay (TRT_LOS/LOS) for FY 2000-2004, 2008, and 2009 because no equivalent variable was available for those years. We used the following algorithm to calculate length of stay: max(TRTOUT-TRTIN,1)[1]. In other words, if the exit date minus the entry date equals zero, then TRT_LOS/LOS would equal 1 since an inpatient stay in a treating specialty is at least one day.
  • Discharge date. Discharge date (DISDAY/DISDAY) was not included in TRT until FY 2007. We did not use this variable to merge with the DISCH file.

After merging the DISCH and summarized TRT with subtotals, we compared the total costs and lengths of stay from each file. Consistently over the years, almost 100% of all records had equal or nearly equal costs, while almost 99% had equal or nearly equal lengths of stay. Chapter 3 discusses in detail the methods we used to reconcile the small number of differences.

3. Methods for Reconciliation

3.1. Cost Reconciliation

Information on each hospital stay from the DISCH and TRT was combined into a single file that had complete data from both datasets. The DISCH identified the total cost for the entire stay, as well as total length of stay. The TRT file identified the cost of care and length of stay in each hospital bed section. We compared the cost reported in the two files to check that the merge was correct and that the data sources were consistent. Nearly all stays (over 99.9%) had equal costs in the DISCH and TRT, or cost differences of less than $100, which we treat as “equal.” The following sections discuss in detail the methods we used to reconcile the remaining records, which represent less than 0.1% of all discharges in the fiscal year.

3.1.1. Cost Method Variable, CST_METH/CST_METH

CST_METH/CST_METH is a categorical variable we created to flag records based on the type of reconciliation used when there were varying cost differences. Most of the records were assigned a value of “0” because no reconciliation was required. In other words, the DISCH cost and the total TRT cost (sum of category costs) differed by less than $100 (nearly all differed by less than $10). Records reconciled using a proportion-based method described in section 3.1.2 were assigned “1”. A value of “2” was given to records we reconciled using national average daily costs as described in section 3.1.3. Finally, a value of “3” was assigned to a handful of records with incomplete data. The subtotals for these records reflect the information we have from FY 2000 (earliest available TRT NDE) onward. Because these admissions occurred prior to FY 2000, you will find that the total DISCH cost is always greater than the sum of costs for all categories of care. Please refer to Table 2 below for the number of records assigned to each value of the cost method variable, CST_METH/CST_METH. Table 2 shows that the accuracy of the data improves every year as fewer records require reconciliation.

3.1.2. Proportion-Based Method for Cost Reconciliation

In FY 2007, we used a proportion-based method to estimate subtotals for records where patients received care at stations 618 (Minneapolis, MN) or 679 (Tuscaloosa, AL). Our findings showed that records from these stations not only had large cost differences, but that the stays also began and ended in FY 2007. According to DSO, there was a costing problem just before the end-of-year NDEs were created that only affected these two stations. The DISCH dataset provides the correct total costs. To reconcile the differences among these few records, we calculated the proportion of each category cost from the total TRT cost and then applied this proportion to the total DISCH cost to generate new subtotals. All records resolved with this approach were flagged in CST_METH/CST_METH with “1”.

Example of Using Proportion-Based Method

A TRT record consists of three category costs: $500 in category 1, $750 in category 4, and $250 in category 5. If the DISCH cost was $2,000, we would multiply $2,000 by 33.3% (500/1,500) to get a new category 1 cost of $666.67, 50% (750/1,500) to get a new category 4 cost of $1,000, and 16.7% (250/1,500) to get a new category 5 cost of $333.33. The sum of these new category costs now equals the total DISCH cost of $2,000.
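The arithmetic in this example can be reproduced with a few lines of Python. This is a minimal sketch; the function name is hypothetical, and the zero-total check is an added safeguard not discussed in the text.

```python
def rescale_to_disch_total(category_costs, disch_total):
    """Apportion the DISCH total across categories in proportion to the TRT category costs."""
    trt_total = sum(category_costs.values())
    if trt_total == 0:
        # Assumption: proportions are undefined when the TRT total is zero.
        raise ValueError("cannot apportion when the TRT total cost is zero")
    return {cat: disch_total * cost / trt_total for cat, cost in category_costs.items()}


# Category costs from the example: $500 (category 1), $750 (category 4), $250 (category 5).
new_costs = rescale_to_disch_total({1: 500.0, 4: 750.0, 5: 250.0}, 2000.0)
rounded = {cat: round(cost, 2) for cat, cost in new_costs.items()}
```

Rounded to cents, the new category costs are 666.67, 1000.00, and 333.33, and the unrounded values sum to the DISCH total of 2,000.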

From FY 2008 onward, the proportion-based method was not employed because there were no records with large cost differences when both admission and discharge occurred in the same fiscal year. Therefore, in Table 2, CST_METH=1 has a zero count from FY 2008-FY 2012.

3.1.3. Applying National Average Daily Costs

For records with cost differences greater than $100 and admission dates in FY 2000 through FY 2012, we could not determine whether the DISCH cost or the summarized TRT cost was correct. Therefore, we applied national average daily costs (see Table 3). These costs were generated from the TRT NDE. The only modification made to the original dataset was the inclusion of category of care costs and lengths of stay. Once the national daily average costs were calculated, we multiplied these values by the length of stay[2] for each category at the stay level. The results were then adjusted by the ratio between the total DISCH cost and the new total TRT cost. Records reconciled in this manner were flagged with a “2” in the cost method variable, CST_METH/CST_METH.
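A minimal Python sketch of this calculation follows. It assumes the final adjustment scales the provisional category costs so that they sum to the total DISCH cost. The function name is hypothetical, and the daily cost figures are made-up illustrative values, not the values in Table 3.

```python
def apply_national_daily_costs(category_los, national_daily_cost, disch_total):
    """Estimate category costs from national average daily costs and category lengths of stay.

    Step 1: provisional cost = national daily average cost * length of stay, per category.
    Step 2: scale by (total DISCH cost / provisional total) so the subtotals
            sum to the DISCH total (assumed direction of the adjustment).
    """
    provisional = {cat: national_daily_cost[cat] * los for cat, los in category_los.items()}
    provisional_total = sum(provisional.values())
    if provisional_total == 0:
        raise ValueError("cannot adjust when the provisional total is zero")
    ratio = disch_total / provisional_total
    return {cat: cost * ratio for cat, cost in provisional.items()}


# Hypothetical stay: 4 days in category 0 and 6 days in category 2.
illustrative_daily_cost = {0: 1000.0, 2: 500.0}  # made-up values, not from Table 3
adjusted = apply_national_daily_costs({0: 4, 2: 6}, illustrative_daily_cost, 14000.0)
```

In this illustration the provisional costs are 4,000 and 3,000, the ratio is 2, and the adjusted category costs are 8,000 and 6,000.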

3.1.4. Records with Incomplete Data

Records with admissions prior to FY 2000 were flagged with a “3” in CST_METH/CST_METH. Category costs associated with these records are incomplete and do not reflect the entire stay because data only exist for FY 2000 onward. Note these records generally represent hospitalizations in VA nursing homes spanning almost a decade on average. Therefore, you will find significant cost differences when compared to the total DISCH cost for the stay.

3.2. Length of Stay Reconciliation

Similar to the cost comparison, we examined the length of stay differences between the DISCH and summarized TRT file. Over the years, 99% of the records consistently did not require reconciliation because the lengths of stay were equal between the two datasets. We defined “equal” to be a difference of 5 days or less.

3.2.1. Length of Stay Method Variable, LOS_METH/LOS_METH

LOS_METH/LOS_METH is a categorical variable we created to flag records based on the type of reconciliation used for length of stay differences between the DISCH and summarized TRT file. Almost all the records were assigned a value of “0” since no reconciliation was required. A value of “1” was given to records where the DISCH length of stay was missing and, therefore, was calculated based on the formula described in the next section. We reassigned length of stay to about 1% of the records. These observations were assigned a value of “2” in the LOS_METH/LOS_METH variable. Lastly, records reconciled with the proportion-based method received a “3”. See Table 4.

3.2.2. Records with Missing DISCH Lengths of Stay

There are usually a handful of records where the DISCH length of stay variable, LOS/LOS, has missing values. In FY 2007, all utilization for these records occurred in acute medicine (category 0). Because these records only required one subtotal, we calculated the length of stay and assigned this value to the length of stay variable for acute medicine, LOS0. The length of stay calculation was based on the following formula: max(DISDAY-ADMITDAY-AGGABS,1)[3]. In other words, if the discharge date minus the admission date minus the aggregate absent days equaled zero, then LOS0 would equal 1 since an inpatient stay is at least one day. We gave these records a flag of “1” in LOS_METH/LOS_METH. Missing values in the DISCH variable, LOS/LOS, were not changed to reflect the new length of stay calculation. We did not assign the TRT length of stay total to LOS0 because these values appeared to be inaccurate.
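The formula max(DISDAY-ADMITDAY-AGGABS,1) translates directly into Python. The sketch below uses a hypothetical function name and made-up dates for illustration only.

```python
from datetime import date


def disch_los(admitday, disday, aggabs=0):
    """max(DISDAY - ADMITDAY - AGGABS, 1): an inpatient stay counts as at least one day."""
    return max((disday - admitday).days - aggabs, 1)


# Ten calendar days between admission and discharge, with two aggregate absent days.
los_with_absences = disch_los(date(2007, 3, 1), date(2007, 3, 11), aggabs=2)

# Admission and discharge on the same day still count as a one-day stay.
same_day_los = disch_los(date(2007, 3, 1), date(2007, 3, 1))
```

Here `los_with_absences` is 8 and `same_day_los` is 1.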

Beginning in FY 2008, some of these encounters with missing DISCH length of stay had utilization in acute medicine, in the unidentified category, or in both. We used the method described earlier in this section to reconcile records with utilization occurring in only one category (acute medicine or unidentified). That approach could not be used for records with utilization in both the acute medicine and unidentified categories. Instead, we employed the method described in Section 3.2.5.

3.2.3. Records with Missing TRT Lengths of Stay

In FY 2012 the TRT had many missing TRTIN/TXSPSDT and TRTOUT/TXSPEDT dates, so three steps were taken to fill in missing dates. First, we had many records missing only a TRTIN/TXSPSDT date. Since MCA creates a new record each time a patient transfers treating specialties within a single stay and creates a new record for each month of an inpatient stay, we set the TRTIN/TXSPSDT date equal to the TRTOUT/TXSPEDT date from the adjacent stay. Second, for stays that did not cross months but were missing a TRTIN/TXSPSDT and TRTOUT/TXSPEDT date, we set TRTIN/TXSPSDT equal to the ADMITDAY/ADMITDAY from the DISCH and the TRTOUT/TXSPEDT date equal to the DISDAY/DISDAY from the DISCH. Third, for remaining records missing TRTOUT/TXSPEDT we used fiscal period (FP) to get the actual month of discharge and used TRTIN/TXSPSDT as a proxy for the month and year of discharge (e.g., TRTOUT/TXSPEDT is the last day of the FP/FP month with the same year as TRTIN/TXSPSDT).

In previous years, we adjusted the total length of stay (“if totlos ne . then do; totlos = totlos + countrec - 1; end;”) to avoid overcounting days within a stay. However, this code was not necessary this year given our changes to the TRTIN/TXSPSDT and TRTOUT/TXSPEDT variables.

Additionally there was one record with a length of stay greater than 1000 days and a missing treating specialty (TRTSP/TXSP). Because part of that stay was found to be in nursing home care, we categorized the entire stay as nursing home care (LOS9 and COST9).

3.2.4. Utilization in Only One Category of Care

There were other records with utilization occurring in only one category. These records, however, had non-missing DISCH lengths of stay. We assigned the DISCH length of stay to the category where utilization occurred. These records were given a value of “2” in LOS_METH/LOS_METH. We did not assign the TRT length of stay because the values appeared to be inaccurate based on our calculations. We also wanted to refrain from changing the original data in the MCA DISCH NDE.

3.2.5. Proportion-Based Method for Reconciling Length of Stay Differences

The remaining records (less than 0.1%) were reconciled using the same proportion-based method employed in section 3.1.2. Since utilization occurred across multiple categories, we could not simply reassign the DISCH length of stay. Using the proportion-based method, we rounded the lengths of stay to the nearest whole day and flagged them with the number “3” in the LOS_METH/LOS_METH variable.

4. Important Notes for Using the HERC Discharge Dataset with Subtotals

The HERC Discharge dataset is functionally identical to the original MCA DISCH file upon which it is based. No modifications have been made to the original data. We have only included new variables corresponding to cost and length of stay subtotals, as well as two flag variables.

4.1. Access to the HERC Discharge Dataset

The HERC Discharge dataset is now stored at VINCI and on the SAS Grid. Access to HERC Discharge data is governed by the VA National Data Systems (NDS). The most current information on the data request process can be found on the VHA Data Portal at http://vaww.vhadataportal.med.va.gov/. To gain access to the HERC Discharge data, follow the appropriate request process for operational or research access to MCA datasets at the VA Corporate Data Warehouse (CDW): http://vaww.vhadataportal.med.va.gov/DataSources/HERCCostData.aspx.

Once approved for access, the files can be found on VINCI and on the SAS Grid. Please note that data at VINCI are only available behind the VA firewall so users must have VINCI clearance in order to access the location. The SAS Grid can only be accessed from within a grid connection or from a Secure FTP client application.

4.2. New Variables Added to MCA DISCH File

HERC adds at least 28 new variables to the MCA DISCH NDE (created by VHA MCA) every fiscal year. These correspond to the 13 cost variables and 13 length of stay variables for each category of care, as well as the two flag variables indicating which method was used to reconcile differences. Subsequent variables are added on an as-needed basis. For instance, in FY 2008, total costs were excluded from both the MCA DISCH and TRT NDEs. HERC calculated these variables and included them in the final dataset.

Special attention should be given to the cost method (CST_METH/CST_METH) and length of stay method (LOS_METH/LOS_METH) variables. Depending on the type of reconciliation we employed, researchers may want to exclude certain records from their analyses. For instance, an analyst may want to query records where CST_METH=0 and LOS_METH IN (0,2)[4] because no modifications were made to the cost and length of stay categories.
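For analysts working outside SAS, the same selection can be sketched in Python. The example assumes the flag variables are stored as numbers, as the CST_METH=0 notation suggests. The function name and the sample records are hypothetical.

```python
def select_unmodified(records):
    """Keep records where CST_METH = 0 and LOS_METH is 0 or 2."""
    return [r for r in records if r["CST_METH"] == 0 and r["LOS_METH"] in (0, 2)]


# Made-up records for illustration only.
sample = [
    {"id": 1, "CST_METH": 0, "LOS_METH": 0},
    {"id": 2, "CST_METH": 0, "LOS_METH": 2},
    {"id": 3, "CST_METH": 2, "LOS_METH": 0},
    {"id": 4, "CST_METH": 0, "LOS_METH": 3},
]
kept_ids = [r["id"] for r in select_unmodified(sample)]
```

Only records 1 and 2 are kept: record 3 had its costs reconciled and record 4 had its lengths of stay reconciled with the proportion-based method.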

Researchers familiar with the HERC Average Cost datasets may know the distinction between local and national costs. MCA provides local costs in their datasets and, therefore, the subtotals we calculated in the HERC Discharge dataset should also be considered local costs. For information regarding local and national cost estimates, please refer to the guidebook, HERC’s Average Cost Datasets for VA Inpatient Care at https://www.herc.research.va.gov/include/page.asp?id=guidebooks#AC.

4.3. No Reassignment of Missing Values

Costs and lengths of stay have missing values in categories with no utilization. However, if there is utilization but no cost, then “$0.00” will appear in the category where the patient received care, while the other categories for that record will have missing values.

We did not assign zero to missing costs or lengths of stay for the following reasons:

  • There are some missing values for length of stay, LOS/LOS, in the DISCH file. As a result, we wanted to be consistent throughout the whole dataset.
  • It is easy to zero-fill missing values, but difficult to remove zero-fills.
  • Most importantly, we did not want to affect summarizations made by researchers using our dataset. For example, if we assigned zeros, then the average cost for a category of care would be lower than expected. If an analyst uses the MEANS procedure in SAS to summarize costs or lengths of stay, for instance, missing values would be excluded from such calculations.
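The effect of zero-filling on an average can be seen in a short Python sketch. It uses None to stand in for a missing value and made-up costs; the function name is hypothetical.

```python
def mean_excluding_missing(values):
    """Average the non-missing values; return None when every value is missing."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


# Made-up category costs for four stays; two stays had no utilization in this category.
category_costs = [1200.0, None, 800.0, None]

mean_with_missing = mean_excluding_missing(category_costs)

# Zero-filling first pulls the average down because the empty stays now count.
zero_filled = [0.0 if v is None else v for v in category_costs]
mean_zero_filled = mean_excluding_missing(zero_filled)
```

With missing values excluded the average is 1,000; after zero-filling it drops to 500.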


We would like to thank the Managerial Cost Accounting Office (MCAO; formerly Decision Support Office (DSO)) for providing information and suggestions toward the creation of this guidebook. We would also like to acknowledge financial support from the VA Health Services Research and Development (HSR&D) and programming support from Jennifer Scott.

[1] Where TRTIN/TXSPSDT = entry date into treating specialty and TRTOUT/TXSPEDT = exit date from treating specialty.

[2] Note that these averages are based on unadjusted lengths of stay.

[3] Where DISDAY/DISDAY = discharge date from VA medical facility, ADMITDAY/ADMITDAY = admission date to VA medical facility, and AGGABS = aggregate absent days.

[4] This code uses the IN operator in the SAS programming language.


Last Updated Date: 2017-09-29