Why Hospitals Still Need People to Monitor and Maintain AI Tools

The promise of artificial intelligence (AI) in health care is undeniable: cutting-edge algorithms can predict patient outcomes, guide difficult treatment decisions, and streamline documentation processes for doctors. Yet, as more hospitals and clinics embrace these tools, a paradox emerges—many of these AI systems require significant human oversight and continuous upkeep, which can be costly and labor-intensive. Below, we explore the challenges hospitals face in adopting AI, the hidden costs, and why experienced professionals remain essential despite rapid advancements in machine learning.

AI Tools and Their Real-World Impact

Cancer Care at Penn Medicine

AI holds particular promise in oncology, where doctors must balance aggressive treatments with palliative or end-of-life care discussions. At the University of Pennsylvania Health System, an algorithm predicts a patient’s chances of death, nudging doctors to discuss treatment goals and end-of-life options. But the pandemic exposed its shortcomings. A 2022 study found the tool grew seven percentage points less accurate during COVID-19, translating to missed opportunities for critical discussions.

Ravi Parikh, an oncologist at Emory University and the study’s lead author, estimates the algorithm failed hundreds of times to prompt doctors about these end-of-life conversations. Similar models at other institutions may have faltered, too, especially if their performance wasn’t monitored routinely.

The Challenge of Maintaining AI Algorithms

Performance Decay and “Nondeterminism”

Over time, AI algorithms can “decay,” meaning they become less accurate if data patterns change. This can happen for obvious reasons—like when a hospital switches lab providers—or for seemingly inexplicable ones. At Mass General Brigham, a system designed to help genetic counselors locate relevant information about DNA variants showed “nondeterminism”: it gave different answers to the same questions just moments apart.
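
To make "decay" concrete, here is a minimal illustrative sketch in Python (with invented numbers and thresholds, not any hospital's actual tooling) of one way a monitoring team might compare a deployed model's recent accuracy against its deployment-time baseline and flag a drop for human review:

```python
# Illustrative sketch of decay monitoring. BASELINE_AUC and ALERT_DROP are
# hypothetical values a team would set at deployment; they are not from
# any real system.
from sklearn.metrics import roc_auc_score

BASELINE_AUC = 0.80   # hypothetical accuracy measured when the model shipped
ALERT_DROP = 0.05     # hypothetical tolerated drop before someone investigates

def check_for_decay(recent_labels, recent_scores):
    """Compare the model's AUC on recent cases against its baseline."""
    current_auc = roc_auc_score(recent_labels, recent_scores)
    drop = BASELINE_AUC - current_auc
    if drop > ALERT_DROP:
        print(f"ALERT: AUC fell from {BASELINE_AUC:.2f} to {current_auc:.2f}; "
              "flag for human review.")
    return current_auc

# Example: the model's risk scores for last month's patients, paired with
# what actually happened (1 = the predicted event occurred).
labels = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
scores = [0.4, 0.3, 0.6, 0.5, 0.2, 0.3, 0.7, 0.1, 0.4, 0.5]
check_for_decay(labels, scores)
```

Even a check this simple requires people: someone must collect the outcome labels, decide the thresholds, and act on the alerts.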

The Human Element in AI Oversight

Stanford Health Care’s chief data scientist, Nigam Shah, highlights another dilemma: AI systems can improve patient care but might also increase overall costs. “Everybody thinks AI will help us with our access and capacity,” he said. “All of that is nice and good, but if it increases the cost of care by 20%, is that viable?”

These higher costs often stem from hiring specialized staff to monitor and validate algorithms over time. In one Stanford study, a team spent 8–10 months and 115 person-hours just auditing two AI models for fairness and reliability.

Why Hospitals Struggle With AI Validation

Lack of Standards and Benchmarks

Evaluating whether an AI tool works—and continues to work—can be extremely difficult. According to Jesse Ehrenfeld, immediate past president of the American Medical Association, “We have no standards. There is nothing I can point you to today that outlines how to evaluate and monitor an AI model once it’s deployed.”

FDA Commissioner Robert Califf echoed these concerns, saying he doubts any single U.S. health system can fully validate an AI algorithm before integrating it into patient care.

Varying Levels of Accuracy

Different AI tools also perform differently under the same conditions. In a recent Yale Medicine study, six “early warning systems” gave widely divergent results. Without a universal benchmark, hospitals struggle to choose the right product. Meanwhile, doctors are left unsure whether to trust or question these automated suggestions.

The Big Business of Health Care AI

Despite these challenges, investment in AI health care startups is growing rapidly. Bessemer Venture Partners has identified around 20 health-focused AI companies poised to make over $10 million in annual revenue each. The FDA has already approved nearly a thousand AI-based medical products, ranging from insurance claim software to clinical diagnostic tools.

Ambient Documentation Tools

One of the most common AI products in doctors’ offices is “ambient documentation,” which listens to and summarizes patient visits. In 2022, investors poured $353 million into these platforms. Yet, experts warn of high error rates. At Stanford, large language models showed a 35% error rate when summarizing patient histories—potentially dangerous in complex medical scenarios where missing “one word, like ‘fever,’” can change the course of treatment.
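
One guardrail a health system might layer on top of such tools is an automated check that clinically significant terms spoken during the visit actually survive into the summary. The sketch below is a toy illustration with an invented term list, not a vendor's actual safeguard:

```python
# Toy sketch of a missing-term check for AI-generated visit summaries.
# The term list, transcript, and summary are all invented for illustration.

CRITICAL_TERMS = {"fever", "chest pain", "allergy", "anticoagulant"}

def find_dropped_terms(transcript: str, summary: str) -> set[str]:
    """Return critical terms present in the transcript but absent from the summary."""
    t, s = transcript.lower(), summary.lower()
    mentioned = {term for term in CRITICAL_TERMS if term in t}
    return {term for term in mentioned if term not in s}

transcript = "Patient reports a mild fever since Tuesday and a penicillin allergy."
summary = "Patient reports feeling unwell since Tuesday; penicillin allergy noted."

dropped = find_dropped_terms(transcript, summary)
if dropped:
    print(f"Summary is missing critical terms: {sorted(dropped)}; "
          "route to clinician review.")
```

A keyword check like this catches only the crudest omissions; judging whether a summary is clinically faithful still requires a person reading it.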

The Path Forward: Continuous Monitoring and Human Oversight

To mitigate the risks of AI “messing up,” institutions must commit resources to monitor these tools continuously. That often includes:

Regular Performance Audits – Checking accuracy and fairness over time (see the sketch after this list).

Algorithm Maintenance – Updating models whenever patient demographics, lab providers, or care protocols change.

Human Validation – Employing data scientists and clinicians to interpret results, spot errors, and refine the algorithms.
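
To make the first item concrete, here is a toy Python sketch (hypothetical data, subgroup names, and threshold) of one basic fairness check: comparing a model's accuracy across patient subgroups and flagging any that fall below a floor:

```python
# Illustrative subgroup audit. The minimum-AUC floor and the audit records
# below are invented; a real audit would use far more cases and checks.
from sklearn.metrics import roc_auc_score

def audit_by_subgroup(records, min_auc=0.70):
    """Group predictions by a patient attribute and flag weak subgroups.

    records: list of (subgroup, true_label, model_score) tuples.
    """
    by_group = {}
    for group, label, score in records:
        labels, scores = by_group.setdefault(group, ([], []))
        labels.append(label)
        scores.append(score)

    for group, (labels, scores) in by_group.items():
        auc = roc_auc_score(labels, scores)
        status = "OK" if auc >= min_auc else "REVIEW"
        print(f"{group:>10}: AUC={auc:.2f} [{status}]")

# Hypothetical audit data: (subgroup, outcome, model risk score).
records = [
    ("group_a", 1, 0.9), ("group_a", 0, 0.2), ("group_a", 1, 0.8),
    ("group_a", 0, 0.3), ("group_b", 1, 0.4), ("group_b", 0, 0.6),
    ("group_b", 1, 0.5), ("group_b", 0, 0.7),
]
audit_by_subgroup(records)
```

In this invented example the model looks fine on one subgroup and fails badly on the other, which is exactly the kind of finding that audits exist to surface and that only clinicians and data scientists can decide how to fix.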

Some experts envision deploying AI to monitor AI—creating new layers of automation to spot errors before they harm patients. However, this introduces added costs and still requires people with the right expertise to oversee it all.

Balancing AI’s Promise With Practical Realities

Hospitals and health care executives face a balancing act. On one hand, AI offers transformative possibilities: personalized treatments, improved patient engagement, and more efficient workflows. On the other hand, integrating AI demands investments in staff training, continuous monitoring, and technology upgrades. As Shah puts it, “Is that really what I wanted? How many more people are we going to need?”

For now, the future of AI in health care likely involves human-machine collaboration. While algorithms can crunch vast amounts of data quickly, skilled professionals remain indispensable for spotting errors, maintaining performance, and making nuanced decisions about patient care.

Key Takeaways for Health Care Leaders

Ongoing Monitoring Is Critical: AI can degrade over time; regular audits help detect and correct performance issues.

Human Oversight Remains Essential: Skilled clinicians and data scientists must supervise and update AI tools to ensure accuracy.

Invest in Infrastructure: Hospitals should allocate budget for technology, personnel, and training to maximize AI’s benefits.

Push for Industry Standards: Better benchmarks can help hospitals evaluate AI tools effectively, reducing risks for patients.

In the end, AI is not a “set-it-and-forget-it” technology. It holds enormous potential for revolutionizing health care, but unlocking its benefits requires long-term commitment, careful oversight, and—ironically—human expertise and resources to keep costs and risks in check.