How Do You Know if Your Doctor Is Doing a Good Job?

February 13, 2023 Justin Moore

We’ve spent a lot of ink in this blog discussing how difficult it is to measure quality in the various U.S. healthcare systems. One large-scale effort to measure quality is the “Medicare Merit-based Incentive Payment System,” or MIPS. MIPS is a big deal for health systems. Quality isn’t just a matter of professional pride: the MIPS program has a significant impact on the reimbursement received by U.S. physicians.

Some of the surveys or questions you’ve had to answer in doctors’ offices over the last few years are undoubtedly tied to those offices’ efforts to improve their MIPS scores. MIPS rates physicians based on measures in four categories:

Quality (30% weight), mostly in terms of clinical outcomes and patient experience. Doctors might be scored on the percentage of hypertensive patients who have their blood pressure controlled or the percentage of their patients who report a high level of satisfaction with their care.
Promoting interoperability (25% weight), how well a physician uses technology to improve the quality and efficiency of their care. Measures in this category might include the percentage of patients using the electronic health record (EHR) portal or how many prescriptions are sent to the pharmacy electronically.
Improvement activities (15% weight), how well a physician is working to improve her practice through activities like quality improvement programs.
Cost (30% weight), how much a physician’s care costs compared to his peers. Think: the number of seemingly unnecessary tests and procedures ordered.

Because the work that, say, a psychiatrist does is so different from the work a urologist does, doctors who participate in MIPS may choose six of a possible 257 performance measures to report, only one of which must be an “outcome measure,” such as hospital admission for a particular illness. The others can be “process measures” like rates of cancer screening. Docs are given a composite MIPS score between zero and 100. To avoid a “negative payment adjustment” (that is, a reduced fee), physicians must score >75, which seems high to me unless I frame it as a solid “C” grade. In the study discussed below, 86% of the docs in the sample achieved at least that score, which suggests either that they are good at gaming the system or that the score isn’t terribly difficult to achieve.
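For readers who like to see the arithmetic, here is a minimal sketch of how a composite score could be built from the four category weights listed above. It assumes each category is scored on a 0 to 100 scale and that the composite is a plain weighted average; the actual MIPS scoring rules are more involved than this. The function names and the example scores are hypothetical, and the strict "greater than 75" cutoff simply follows the wording in this post.

```python
# Category weights as listed in this post (they sum to 100%).
WEIGHTS = {
    "quality": 0.30,
    "promoting_interoperability": 0.25,
    "improvement_activities": 0.15,
    "cost": 0.30,
}

# Cutoff taken from the post's ">75"; whether the real rule is
# inclusive or exclusive at exactly 75 is an assumption here.
PENALTY_THRESHOLD = 75.0


def composite_score(category_scores):
    """Weighted average of category scores (each assumed to be 0-100).

    This is an illustrative simplification, not the official formula.
    """
    missing = set(WEIGHTS) - set(category_scores)
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS)


def avoids_penalty(score):
    """True if the composite score clears the assumed penalty threshold."""
    return score > PENALTY_THRESHOLD


# Hypothetical physician: strong on improvement activities, weaker on cost.
example = {
    "quality": 80.0,
    "promoting_interoperability": 90.0,
    "improvement_activities": 100.0,
    "cost": 60.0,
}

score = composite_score(example)
print(f"composite: {score:.1f}, avoids penalty: {avoids_penalty(score)}")
```

Even in this toy version, a mediocre cost score can be offset by near-perfect marks in the reporting-heavy categories, which is part of why a high composite need not say much about outcomes.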

In spite of the massive effort put into MIPS by regulators, docs, and health systems, it’s unclear whether the MIPS program really reflects the quality of care provided by participating physicians. To find out, researchers used Medicare datasets to analyze 3.4 million patients treated in 2019 by 80,246 primary care physicians (paywall). They looked specifically at five “process measures,” like rates of diabetic eye examinations and breast cancer screens, and at two “patient outcomes”: all-cause hospitalizations and emergency department visits.

They found that physicians with low MIPS scores (<30) had worse performance on three of the five process measures compared to those with high (>75) MIPS scores. Specifically, the low-scoring docs had lower rates of diabetic eye exams, HbA1c screening for diabetes, and mammography for breast cancer screening. However, the low-scoring docs had better rates of flu vaccination and tobacco screening. In the “patient outcomes,” there was no consistent association with MIPS scores: emergency department visits were lower (i.e., better) for those with low MIPS scores, while all-cause hospitalizations were higher (worse).

Overall, these inconsistent findings suggest that the MIPS program may not be an effective way of measuring and incentivizing quality improvement among U.S. physicians. The “patient outcomes,” which I think most of us would be most interested in, showed no clear association with MIPS scores. In addition, as in any correlational study, there were outliers: some physicians with low MIPS scores had very good composite outcomes, while others with high MIPS scores had poor outcomes. This suggests that other, more nuanced factors that the MIPS program does not capture may influence a physician’s performance.

The study is recent enough that we don’t yet have peer-reviewed criticism or hypotheses about why MIPS might be failing. But a blog post from Cornell puts it this way: “…there is inadequate risk adjustment for physicians who care for more medically complex and socially vulnerable patients and that smaller, independent primary care practices have fewer resources to dedicate to quality reporting, leading to low MIPS scores.” So, sicker patients going to smaller, independent practices may drag down results. Or, as Dr. Amy Bond puts it more frankly in the same blog post, “MIPS scores may reflect doctors’ ability to keep up with MIPS paperwork more than it reflects their clinical performance.” For our comrades in Human Resources, I suspect this criticism rings especially true.

As the Medical Director of the Kansas Business Group on Health, I’m sometimes asked to weigh in on hot topics that might affect employers or employees. This is a reprint of a blog post from KBGH.