
Monday Aug 15, 2016

Swing and a Miss: Do Quality Measures Have Dubious Value?

In 2006, Tampa Bay Rays executive James Click wrote an essay in the book Baseball Between the Numbers about how light-hitting backup catcher Mike Redmond fared so well against Hall of Fame pitcher Tom Glavine. During his 13-year career, Redmond had a respectable batting average of .287, but he didn't play regularly because of his mediocre defense and lack of power.

Glavine, on the other hand, spent the majority of his 22-year career as one of the best pitchers in Major League Baseball, twice winning the Cy Young Award as the best pitcher in the National League.

Redmond faced Glavine 51 times before retiring after the 2010 season, and he got a hit nearly 44 percent of the time. Not only that, he displayed more power against Glavine than he typically did against any other pitcher. In other words, when this marginal batter faced this all-time great hurler, Redmond seemed to turn into a modern-day Babe Ruth.

So what did Redmond know about Glavine that other hitters didn't? Nothing. This was a statistical anomaly, a function of small sample size. Redmond didn't face Glavine enough times to accurately say that he had any true advantage other than luck. It's possible that something gave Redmond some benefit against Glavine, but they didn't face each other enough for us to say that with certainty.

What does this anecdote have to do with health care? We have to be careful about what we assertively call "quality." For example, in my practice, I have a panel of roughly 600 patients. This is smaller than a typical family physician's panel because a significant part of my time is spent teaching medical students and residents, and I also have various administrative duties. Of these 600 patients, a little more than 50 have type 2 diabetes. Given the various quality metrics that assess the care I provide to individuals with diabetes (blood pressure management, hemoglobin A1c testing and results, eye exams, foot exams and kidney disease screening -- all per the recent core quality measures from CMS), can we be certain that the sample size of my patient panel is large enough to rule out a statistical oddity in my "quality"?

Do we even know what the appropriate sample size is for reliable measurement of my quality? Not that I can find.
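One rough way to see the problem is to ask how much uncertainty surrounds a rate measured on about 50 patients. The sketch below is an illustration, not a CMS method: it computes an approximate 95 percent Wilson score interval for a proportion. The 80 percent success rate and the comparison panel of 500 patients are hypothetical numbers chosen only for the example; the function name `wilson_interval` is likewise an assumption.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    successes: number of patients meeting the measure
    n: number of patients measured
    z: normal quantile (1.96 corresponds to roughly 95% confidence)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical: 80% of patients meet a measure, on a small and a large panel.
for n in (50, 500):
    low, high = wilson_interval(round(0.8 * n), n)
    print(f"n={n}: plausible true rate from {low:.2f} to {high:.2f}")
```

Under these assumptions, the interval for 50 patients is about three times as wide as the interval for 500, so the same observed rate says much less about the underlying performance of a small panel.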

Spencer Nam, an analyst at the Clayton Christensen Institute for Disruptive Innovation, has pointed out that the measurement of quality in other industries focuses on process, but this is not how we measure it in health care.

"Relying on metrics when there is no agreement on a standardized process causes some of the patient care decisions to be made based on reimbursability, while adding administrative responsibilities to track measures that are irrelevant to patient outcomes," Nam said. "Instead of improving efficiency and effectiveness," not to mention the true quality of care provided, "these metrics become extremely burdensome to the system."   

So if the sample size is not standardized and the metrics themselves may not be true measures of quality, what do we do? The examples of disruptive models that Nam discusses, such as Geisinger Health System and Intermountain Healthcare (as well as, theoretically, accountable care organizations), focus their quality measures on a broader scale. Although these organizations still take some account of quality as measured for individual physicians and other health care professionals, it is the quality measures of health care teams and care processes that actually make the difference.

If we rely on teams to care for patients and populations, then it makes no sense to measure quality for a single physician. There are others in my clinic who care for patients with diabetes, but we all use the same support teams to deliver the essential services to these patients. Shouldn't the clinic's overall team be measured instead of the individual clinician? I think so, but we're not there yet.

So how do we know if I'm like Mike Redmond, the mediocre player who was lucky against top-notch talent, or Tom Glavine, the ace who was dominated in a small sample? The best way to accurately answer that question is to see which team actually won the most.

Kyle Jones, M.D., is a faculty member at the University of Utah Family Medicine Residency Program in Salt Lake City. He is the director of primary care at the Neurobehavior HOME Program, a patient-centered medical home for those with developmental disabilities. You can follow him on Twitter @kbjones11.


RAND did the consulting for CMS on MACRA and told the agency that the measures were not valid for the low numbers claimed, but CMS went ahead and claimed validity and reliability. Only innovation advocates, those isolated from the reality of U.S. people and health care, or the biased fail to understand that health outcomes are mostly about people factors, with clinical interventions accounting for at best 10% of the influence. This leaves quality metrics incapable of rising above the noise: the people factors, the genetics, and other influences. Even when you can best assign a physician to a procedure and outcome (NY CABG rating) and have better to best data, analysis, experience, and reputation, it is not possible to discern 80% of physicians from average (from The Health Care Blog). What we know from over 14 studies is that pay for performance in all of its various forms ends up penalizing those who care for patients with people factor issues: behaviors, situations, environments (Sullivan review, THCB). Hong demonstrated this in JAMA with CHCs as representative of the underserved, and adjustments could not be made for disadvantage. This relationship is so strong that you can predict errors in published or promoted studies when a difference is found (PCMH, rural vs. urban, MD vs. NP, teaching hospitals before and after resident work hours). We know this for hospitals that are penalized because they care for rural, underserved, or disadvantaged patients who have lesser outcomes. Message to digital manipulators: Learn from WarGames. Set Players = Zero for WOPR to learn that you cannot win with quality metrics. The only way to win is not to play at all. Family physicians understand that people have far too many dimensions and too many differences in each dimension to predict much of anything, other than the widening divisions that we see. The payment design paying less for cognitive, office, and basic services has long contributed to disparities.
The innovative designs have made this worse, not to mention other consequences such as accelerating cost of delivery and declining productivity due to quality metric regulation. We are not against quality. We are against bad science and discrimination by design. We are for investments in the team members to deliver the primary care, mental health, and mental health in primary care (50% of mental health). We are tired of sending talent, time, and treasure everywhere else.

Posted by Robert C. Bowman, M.D. on August 15, 2016 at 10:45 PM CDT #

Great posts.

Those who echo the message in this article and its comments, in my opinion, are those who have to be responsible for the final outcome of a patient, or a patient base, or a community, or a family of patients. As Michael Jordan could tell you, it takes a team to win a game, not a superstar, nor a bundled group of superstars (the Lakers tried it and failed). These front liners are dependent on a team, coordination, and processes. They are the doers as well as the coordinators, and silent leaders in their community. As for the Lakers, great players weren't enough. It took a great coach (Phil Jackson), aka physician leaders, as well.

Those who don't echo this message entirely are not concerned with the totality of patient care, nor its longitudinal investments. They may be interested only in some "point-of-care" feature, like selling a car with a package of GPS, turbo, and Bluetooth. I am old enough to remember a time when electric windows and eight-track cassette audio players were big selling points. It's a persuasion to buy, and buying means profits.

"We are not there yet." I agree. HbA1c and patient satisfaction scores are not everything that health care is, could be, or should be, but they do sell, and they make it easier for a consumer-based, patient-centered market to "buy" and for employers to understand and "select." Time studies are falsely being used to sell "access," while in reality, no visit is ever just "15 minutes."

If this is the current force in the "market," then it is cross-sectional and not longitudinal. Perhaps this is why the health care market is so volatile. It IS about numbers. It's more volatile when you consider that a major force behind it is technology. Technology changes faster than people can keep up, and given a choice between harnessing the current power of technology (mostly databases) and human capital to gain a market advantage, the leaders of our current times mostly opt for the former, if you just look at the dollars invested in each category. Technology will help with public health care policies, but not with everything.

Speaking of HbA1c, I got upset a few years back upon learning that the "evidence-based" guidelines had just raised the HbA1c targets for the elderly as standards of care, citing hypoglycemia and cardiac complications from aggressively tightening the blood sugar range. The collateral damage was already done: many bonuses and even partnerships were denied based on faulty past metrics that, in retrospect, weren't evidence-based at all. They were mostly industry-set standards. They used to post these numbers in the hallway en route to the nurses' lounge and rank them. Sadly, many names are no longer on the list, and the list changes all the time. Quality must be longitudinal, not cross-sectional, and it's not just about the "n" power. We all know that. It takes years to complete Phase I, II, and III trials for that reason, and for the complexities that Dr. Bowman cited.

Unfortunately, the market is very aggressive, and aggressive "dial manipulators" tend to take shortcuts to make a "selling point." There are a lot of middlemen-women involved who don't have to be responsible for the final outcome, merely "delivery." In a hyper-competitive market, this is even worse. Timing is everything.

Posted by Michael N., MD. on August 17, 2016 at 10:35 AM CDT #

One would think that the goal of monetizing all aspects of primary care is to make money for big box players, whose quality is suspect to those with access to their data. This theory is trivial, even though the big boxes can manipulate their data by cleaning charts, something we in smaller shops can't. Think Volkswagen.
The real objective is to have clear, clean balance sheets so that major systems can leverage bonds to buy more systems. The real money is in selling collateralized obligations on these bonds--like real estate in 2007-2008.
BUT--for our patients and communities, three particulars:
1-Published data on confidence limits and sample sizes--for most measures, one needs 200 providers to show statistical reproducibility.
2-Ethics--do you tell your patients with diabetes YOU are paid more if they take their statin? Even if 'statins in diabetes' is a soft evidence guideline?
3-No one in real life has shown benefit for 'value based payment.' No one! It's been tried.
Killing independent practices, which provide over 40% of primary care, would break the primary care system for the sake of big box corporate welfare. Our communities would suffer.

Posted by Bob Morrow on August 18, 2016 at 09:18 AM CDT #


The opinions and views expressed here are those of the authors and do not necessarily represent or reflect the opinions and views of the American Academy of Family Physicians. This blog is not intended to provide medical, financial, or legal advice. All comments are moderated and will be removed if they violate our Terms of Use.