Risk Adjustments for Individuals and Groups

CMS recently released a report that measures the accuracy of the risk adjustment models it uses for Medicare Advantage plans, the Physician Group Practice demonstration and other programs, and that it is expected to use for Accountable Care Organizations (ACOs). The report provides some interesting insights into the effectiveness of risk adjusters. It is entitled “Evaluation of the CMS-HCC Risk Adjustment Model” and can be downloaded from here: http://www.hccblog.com/wp-content/uploads/2011/05/Evaluation_Risk_Adj_Model_2011.pdf
The report describes the history of the Medicare risk adjustment process and the evolution of the various risk measurement models. It describes the adjustment calculation process itself and gives examples of those calculations. It also describes the inclusion of frailty adjusters and the adjustments that are made for institutionalized members.
The primary objective of the report is to show how accurately future costs can be predicted from the hierarchical condition category (HCC) scores of groups of patients. Two metrics are discussed for showing prediction accuracy: the “prediction ratio” and the “R²” statistic. The prediction ratio applies to groups of patients and is the ratio of the aggregate predicted cost to the aggregate actual cost. A perfect model would result in a prediction ratio of 1; a ratio greater than 1 would indicate that the aggregate predicted cost exceeded the aggregate actual cost for the group, and a ratio less than 1 would indicate the reverse. By contrast, R² applies to individual members and indicates how closely the predicted costs track the actual costs member by member.
These two metrics don’t necessarily move together: a model can have a prediction ratio that’s close to 1 while having a low R² value, and in fact the researchers found that combination frequently. This distinction is important, since it’s tempting to predict individual costs from risk scores and then to question the overall accuracy of the prediction model when those individual predictions turn out to be inaccurate. The researchers discuss this issue early in the report (page 5) and explain why risk adjusters, while not accurate for predicting the costs of individuals, are nonetheless accurate for groups.
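To make the distinction concrete, here is a minimal Python sketch of the two metrics. The function names and the eight-member data set are hypothetical, and R² is computed with the common 1 - SS_res/SS_tot definition, which is an assumption on our part; the report's exact calculation may differ.

```python
def prediction_ratio(predicted, actual):
    """Aggregate predicted cost divided by aggregate actual cost for a group."""
    return sum(predicted) / sum(actual)


def r_squared(predicted, actual):
    """Share of individual cost variation explained by the predictions.

    Assumed definition: 1 - SS_res / SS_tot.
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(predicted, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical members: four low-risk (predicted $4,000 each) and
# four high-risk (predicted $12,000 each).
predicted = [4000, 4000, 4000, 4000, 12000, 12000, 12000, 12000]
actual = [500, 9000, 1500, 5000, 30000, 2000, 4000, 12000]

ratio = prediction_ratio(predicted, actual)
r2 = r_squared(predicted, actual)
print(f"prediction ratio = {ratio:.3f}")  # 1.000
print(f"R-squared        = {r2:.3f}")     # 0.194
```

In this made-up group the totals match exactly, so the prediction ratio is 1, yet the predictions explain less than a fifth of the member-to-member variation.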
This may initially be counterintuitive. Why would the sum of inaccurate measurements lead to an accurate total? Well, consider the task of predicting the cost of heart attacks in a population. While a heart attack can’t be predicted for a specific individual (at least from the data available to CMS), historical data might show that one of every 200 people with hyperlipidemia will have a heart attack in a given year. (That factor is hypothetical; it may be wildly inaccurate.) The HCC grouper software would therefore include 0.5% of the cost of a heart attack in the predicted cost of every hyperlipidemic member. This would make each individual member’s predicted cost inaccurate, yet the total for the group would include the correct predicted cost.
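The arithmetic behind the heart attack illustration can be sketched in a few lines of Python. Every number here is hypothetical: the 1-in-200 rate comes from the illustration above, while the $50,000 cost per heart attack and the 1,000-member population are assumptions chosen only to keep the numbers round. Only heart attack costs are modeled.

```python
# Hypothetical figures, following the 1-in-200 illustration in the text.
POPULATION = 1000              # members with hyperlipidemia (assumed)
MEMBERS_PER_EVENT = 200        # one heart attack per 200 members per year
HEART_ATTACK_COST = 50_000.0   # assumed cost of one heart attack

# Every member is loaded with an equal 0.5% share of the event cost.
predicted_per_member = HEART_ATTACK_COST / MEMBERS_PER_EVENT

# Suppose the year plays out exactly as the historical rate suggests.
events = POPULATION // MEMBERS_PER_EVENT
actual = [HEART_ATTACK_COST] * events + [0.0] * (POPULATION - events)
predicted = [predicted_per_member] * POPULATION

# Per-member error: positive means the model over-predicted.
individual_errors = [p - a for p, a in zip(predicted, actual)]

print(predicted_per_member)    # 250.0
print(sum(predicted))          # 250000.0
print(sum(actual))             # 250000.0
print(max(individual_errors))  # 250.0 (995 members who had no event)
print(min(individual_errors))  # -49750.0 (5 members who had one)
```

No single member's prediction is right, but the group total is. In a real year the number of events would vary around the expected five, which is one reason accuracy falls off for small groups.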
Most of the report discusses the accuracy of the CMS HCC risk adjustment model as applied to various subsets of the Medicare population. The subsets were stratified by various chronic diseases, interactions between diseases, and cost deciles within each of those groups. Particular attention was paid to chronic conditions special needs plans (C-SNP) patients, as defined by groups of HCCs. This analysis may be of particular interest to ACOs, which are expected to focus on managing the care of this costly segment of the population.
In general, the risk adjustment model predicted costs with a high degree of accuracy, resulting in prediction ratios of 1.000 for alcoholism, cancer, autoimmune disease, heart failure, diabetes, end-stage liver disease, HIV/AIDS, and chronic and disabling mental conditions. (Achieving a prediction ratio of exactly 1.000 seems virtually impossible, but we have no basis to question these results.)
By contrast, the model underpredicted costs of dementia by almost 15%, hematological and neurological disorders by about 7%, and cardiovascular and lung disorders by 3% and 2% respectively. It never overpredicted costs in these groups. The sample sizes in these groups varied widely; the cardiovascular disorders group contained more than 500,000 members while the end-stage liver disease group contained fewer than 3,000.
Each of the C-SNP groups was further stratified by expenditures per patient, separating the population into deciles, each containing 10% of the members ranked by their total costs. The objective of this analysis was to ascertain the predictive accuracy of the model for lower- and higher-cost patients. There was significant variation in this analysis; for example, the model slightly underpredicted the costs of extremely low-cost and high-cost diabetics, while overpredicting the costs of moderately costly diabetics. It significantly overpredicted the costs of lower-cost HIV/AIDS patients, while significantly underpredicting the costs of higher-cost HIV/AIDS patients.
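A decile breakdown of this kind can be sketched as follows. This is a simplified illustration rather than the report's method: the function name is hypothetical, members are ranked by actual cost as described above, and the 20-member data set is invented, with every member given the same predicted cost.

```python
def decile_prediction_ratios(predicted, actual):
    """Rank members by actual cost, split them into ten equal-sized
    groups, and return the prediction ratio for each group."""
    members = sorted(zip(actual, predicted))  # ascending by actual cost
    n = len(members)
    ratios = []
    for d in range(10):
        chunk = members[d * n // 10:(d + 1) * n // 10]
        total_actual = sum(a for a, _ in chunk)
        total_predicted = sum(p for _, p in chunk)
        # An empty or zero-cost decile has no meaningful ratio.
        ratios.append(total_predicted / total_actual
                      if total_actual else float("nan"))
    return ratios


# Hypothetical group: actual costs of $1,000 to $20,000, and a model that
# predicts the group average ($10,500) for every member.
actual = [1000 * i for i in range(1, 21)]
predicted = [10500] * 20

ratios = decile_prediction_ratios(predicted, actual)
for number, ratio in enumerate(ratios, start=1):
    print(f"decile {number:2d}: prediction ratio {ratio:.2f}")
```

The overall prediction ratio for this group is exactly 1, yet the lowest-cost decile is heavily overpredicted and the highest-cost decile underpredicted, the same direction as the HIV/AIDS pattern described above.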
What does all of this mean? Well, as noted in previous Single Tracks blogs (Risk Scoring and the Luck of the Draw), it shows that the HCC model may under- or over-predict the costs of groups of patients, and that the accuracy of prediction decreases as the size of the group decreases. The analysis was also performed on the Medicare 5% sample population, which may not be representative of your particular population. Therefore, it’s critical for an ACO to compare its aggregate risk scores with its actual costs to see how those costs compare to the predicted costs, with the objective of identifying any inherent bias in the measurement for its own population.
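As a rough sketch of that check, an ACO could convert risk scores to predicted dollars and compute its own prediction ratio. The function name, the five-member data set and the $10,000 benchmark are hypothetical, and the sketch assumes a risk score of 1.0 corresponds to the benchmark per-member cost; the actual ACO methodology was not settled at the time of writing.

```python
def aco_prediction_ratio(risk_scores, actual_costs, benchmark_cost):
    """Predicted total (benchmark cost x sum of risk scores) divided by
    the population's actual total cost."""
    predicted_total = benchmark_cost * sum(risk_scores)
    return predicted_total / sum(actual_costs)


# Hypothetical ACO population of five members.
risk_scores = [0.8, 1.0, 1.5, 2.2, 0.5]
actual_costs = [9000, 7000, 20000, 30000, 9000]
BENCHMARK_COST = 10_000  # assumed annual cost for a risk score of 1.0

ratio = aco_prediction_ratio(risk_scores, actual_costs, BENCHMARK_COST)
print(f"prediction ratio = {ratio:.2f}")  # 0.80
```

A ratio of 0.80 would mean the model underpredicts this population's costs by about 20%. With only five members that could easily be chance, so the comparison is meaningful only for a population of reasonable size.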
From the draft ACO regulations released in March 2011, it’s still not clear how the risk adjustment will be performed, so continuing attention to this issue is warranted. In addition, although the prediction accuracy of costs for individual patients is not high, it seems worthwhile to compare individual risk scores with members’ actual costs and review any significant disparities. While a high degree of correlation is not expected, patients whose costs are significantly in excess of the predictions may be overtreated, while patients whose costs are significantly lower than predicted may not be receiving proper care. In this case the predicted costs are being used for patient management, not financial reporting, and should yield useful results.
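A review list of that kind could be produced with something like the sketch below. The `Member` record, the function name and the 2.0 and 0.5 thresholds are all hypothetical; what counts as a significant disparity is a judgment call for each organization.

```python
from dataclasses import dataclass


@dataclass
class Member:
    member_id: str
    predicted_cost: float
    actual_cost: float


def flag_disparities(members, high_ratio=2.0, low_ratio=0.5):
    """Return (well_above, well_below): IDs of members whose actual cost
    is at least high_ratio times, or at most low_ratio times, the
    predicted cost. Both thresholds are illustrative assumptions."""
    well_above, well_below = [], []
    for m in members:
        if m.predicted_cost <= 0:
            continue  # no usable prediction for this member
        ratio = m.actual_cost / m.predicted_cost
        if ratio >= high_ratio:
            well_above.append(m.member_id)
        elif ratio <= low_ratio:
            well_below.append(m.member_id)
    return well_above, well_below


# Hypothetical members.
members = [
    Member("A", predicted_cost=8000, actual_cost=21000),  # far above
    Member("B", predicted_cost=12000, actual_cost=11000),
    Member("C", predicted_cost=15000, actual_cost=3000),  # far below
]

well_above, well_below = flag_disparities(members)
print(well_above)  # ['A']
print(well_below)  # ['C']
```

The flags are only prompts for clinical review. Given the low accuracy of individual predictions, many flagged members will turn out to be ordinary statistical variation.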