The Promise and Peril of Performance Measurement

Few would dispute the assertion that our behavioral healthcare system and the many institutions on which it depends are in a state of transformation, if not upheaval. This transformation is characterized by many overarching themes and trends, most of which aim to enhance the quality of care delivered to recipients of healthcare services and to do so at lower cost. The Institute for Healthcare Improvement envisioned nothing less when it introduced the “Triple Aim” of healthcare reform in 2007, an ambitious plan that added improvement in overall population health to its goals for the new millennium (American Hospital Association, 2015). Subsequent developments, including the enactment of the Patient Protection and Affordable Care Act (ACA) and establishment of the Medicaid Redesign Team (MRT) and Delivery System Reform Incentive Payment (DSRIP) program, readily adopted the Triple Aim and its corollary initiatives. Foremost among these is the replacement of Fee-for-Service reimbursement systems with Alternative Payment Models (APMs) that recognize and reward quality in service delivery. Behavioral healthcare providers (and the many Community Based Organizations (CBOs) that provide ancillary support services to individuals with behavioral health needs) can no longer rely on payers to compensate them simply for the volume of services delivered. As envisioned in the New York State Roadmap for Medicaid Payment Reform, providers must demonstrate their “value” through measurable contributions to the attainment of the Triple Aim (New York State Department of Health, 2018). How does a provider, agency, or consortium of agencies demonstrate value? It is all in the measurement process. And therein lies great opportunity and tremendous peril.

Performance measurement is certainly not new to the healthcare industry, nor is it unique to it. Frederick Winslow Taylor and his fellow progenitors of “scientific management” were among the first to apply the tools of the industrial efficiency movement to public school reform more than a century ago (Muller, 2018). Such “reforms” continue to abound, as manifest in the No Child Left Behind Act, the Common Core curriculum and myriad other initiatives of enduring influence and dubious merit. Other sectors including law enforcement, the military, business and finance have also been subject to various forms of performance measurement, most of which were born of the seemingly noble intention to promote quality, efficiency, transparency and accountability in the name of a greater public good. The healthcare industry is another eminently logical target for reform inasmuch as it commands an ever-increasing share of our collective resources and produces uneven results, at best. By 2025 it is expected to consume a fifth of our nation’s GDP, nearly 50% of which will be shouldered by local, state and federal governments (Muller, 2018). This outsized investment has failed to produce a commensurate benefit to the public, however. The World Health Organization (WHO) ranks the American healthcare system as 37th overall (World Health Organization, 2018). Such an imbalance between investment and results is especially pronounced in our backyard. New York spends more per capita than any other state on healthcare (excepting New Mexico) for which it has achieved a mediocre ranking of 17th “best” based on cost, accessibility and outcomes (Robinson, 2018). The need for reform is indisputable, and it is hardly surprising this industry has been subject to a proliferation of process and performance measures in recent years that aim to enhance the quality and efficiency (i.e., value) with which its services are delivered. These are logical and laudable goals, but they run a grave risk of producing unintended consequences.

In 1976 the social psychologist Donald Campbell suggested that quantitative measures used to inform policy decisions are invariably subject to corruption (Hess, 2018). His formulation gave rise to an eponymous law that exerts considerable influence in performance measurement programs. Perhaps the most common and recognizable manifestation of Campbell’s Law occurs within the realm of policing. The requirement that officers issue a minimum number of citations within a specified time period (i.e., quotas) as an indicator of productivity is fraught with obvious pitfalls. Similarly, standardized tests in public education may distort and compromise the educational process, especially when used as bases for instructors’ performance evaluations. “Teaching to the test,” a practice wherein instructors focus exclusively on material that might appear in such tests, is potentially detrimental to students inasmuch as it neglects other subject matter essential to their balanced and comprehensive education. Examples of such unintended consequences abound within the healthcare industry, and they will likely continue unabated in view of the recent proliferation of “pay for performance” arrangements between payers and providers. For instance, the ACA established the Hospital Readmissions Reduction Program (HRRP), an initiative that authorized the Centers for Medicare and Medicaid Services (CMS) to impose financial penalties on hospitals whose 30-day readmission rates exceed accepted standards (Muller, 2018). (A readmission within 30 days of discharge is often construed as a “failure” and indicative of ineffective treatment, inadequate discharge planning or other deficiencies that, if corrected, would preclude readmission.) The imposition of financial penalties produced the desired result, or so it seemed. Readmission rates decreased. Subsequent analyses revealed a potentially insidious trend, however. Between 2006 and 2013, the incidence of “observation stays” for Medicare patients nearly doubled (Muller, 2018). Patients placed on observation status are effectively admitted for relatively brief periods and receive a variety of services customarily provided to occupants of inpatient units. Services rendered to them are simply classified (and billed) differently. Most significantly, such observation stays are not considered readmissions for public reporting purposes, nor do they incur penalties pursuant to HRRP provisions. Our state DSRIP program similarly aims to reduce potentially preventable hospital admissions, and Performing Provider Systems (PPSs) that fail to achieve established metrics in this and related domains are subject to reduced remuneration. Perhaps not surprisingly, most PPSs are poised to achieve a 25% reduction in potentially preventable readmissions (the overarching goal of the DSRIP program) by 2020. By December of 2017, PPSs had already reduced such readmissions by 16.5% (New York State Department of Health, 2017). One cannot help but wonder whether this trend, promising as it seems, obscures a rise in other practices potentially deleterious to the welfare of many of our state’s most vulnerable recipients.

Our mounting preoccupation with performance measures also threatens to reduce inherently complex phenomena to unduly simple, albeit readily quantifiable, elements. Proponents of performance measurement often attempt to correct for this tendency via “risk adjustment,” a practice that calibrates measures to reflect unique characteristics of the specific objects of measurement. Nevertheless, this process often lacks the rigor or sophistication necessary to account for innumerable factors that influence the measurement process, as revealed by another analysis of hospital readmission rates among Medicare patients. An investigation of risk-adjusted readmission rates exposed grave deficiencies in the adjustment algorithm applied to the population of interest insofar as it adjusted only for patients’ age, gender, discharge diagnosis and recent diagnoses. The authors stratified this population according to an additional 29 characteristics and found that 22 of them were significantly predictive of readmission beyond the standard adjustments originally applied (Barnett, Hsu & McWilliams, 2015). Behavioral healthcare providers that aspire to establish value-based payment arrangements for complex populations, such as those enrolled in Health and Recovery Plans (HARPs), would do well to heed these findings lest they be unjustly penalized by risk adjustment algorithms that fail to account for the many medical, socioeconomic and other demographic factors unique to these populations that will surely influence the outcomes of their interventions.

Performance measurement is now deeply embedded in our health and behavioral healthcare system, its limitations notwithstanding. It is incumbent on providers and the emerging networks of which they are a part to carefully evaluate the opportunities it provides to advance the Triple Aim and the hazards associated with its application. Our system’s long-term sustainability and the health and welfare of the individuals entrusted to its care hang in the balance.

Mr. Brody may be reached at Search for Change at (914) 428-5600 (x9228), and by email at
