The tech giant is aiming to get on a strong competitive footing against Amazon and Microsoft, but experts say there are plenty of challenges for the technology in the world of medicine.
At the end of each hospital shift, the outgoing nurse has to quickly bring the incoming one up to speed about all of the patients under their care. This “handoff” can take multiple forms, including conversations, handwritten notes and electronic medical records. “[It’s] a risky part of the healthcare journey, because we’re transferring information from one healthcare provider to another,” says Michael Schlosser, senior vice president of care transformation and innovation for HCA Healthcare. “We have to make sure that it’s done in an accurate way and that nothing falls through the cracks.”
Schlosser and his team at Nashville-based HCA – one of the largest healthcare systems in the country with 180 hospitals and around 37 million patients a year – thought this transfer of information could be a good opportunity to apply generative artificial intelligence. Large language models are good at summarizing and organizing data. But when HCA scoured the market for potential vendors, Schlosser says they couldn’t find any companies building solutions for this handoff issue.
HCA had an existing partnership with Google Cloud, so it turned to Google’s software suite called Vertex AI, which helps customers build and deploy machine learning models. Google offers its own foundation model, known as PaLM, but the platform is model agnostic, meaning customers can also swap in and build on OpenAI’s GPT-4, Meta’s Llama, Amazon’s Titan or any other model of their choosing.
In a bid to woo more healthcare customers, Google has also been developing a healthcare-specific large language model. The company announced Tuesday it will release the latest version – called Med-PaLM 2 – to a wider number of customers in September. HCA is one of several healthcare customers that have had early access, along with the pharmaceutical giant Bayer, electronic health record company Meditech and digital health startups Infinitus Systems and Huma. This renewed push into healthcare comes as Microsoft and Amazon are making their own AI-powered inroads into the sector, and it’s far from clear which will come out on top when the dust settles.
“We’re still five minutes into the marathon,” Gartner analyst Chirag Dekate says of the healthcare AI landscape.
In 2021, Google disbanded its standalone Google Health division but said health-related efforts would continue across the company. Its recent AI solutions in the industry are geared towards solving piecemeal problems. For example, Google released AI tools last year to help healthcare organizations read, store and label X-rays, MRIs and other medical imaging. Earlier this year, the company unveiled AI tools to help health insurers speed up prior authorization.
The use case focus is necessary because of AI technology itself, says Greg Corrado, head of health AI at Google. Despite the hype over large language models, he says it’s “naive” to expect them to be “able to do anything expertly off the shelf,” adding that “In practice, these systems always require identification of specific use cases.”
When it comes to large language models, Google has been playing catchup to OpenAI, the startup behind the viral chatbot ChatGPT, which has received a $10 billion investment from Microsoft. In 2022, Microsoft acquired Nuance Communications for $18.8 billion, giving it a major foothold to sell new AI products to hospital clients, since Nuance’s medical dictation software is already used by 550,000 doctors. “Nuance has an enormous footprint in healthcare,” says Alex Lennox-Miller, an analyst for CB Insights. That footprint, he says, makes Microsoft “well-positioned” to use its generative AI software for administrative tasks in the sector.
Before the generative AI boom, Amazon, Microsoft and Google were all competing for cloud customers. With $48.1 billion in cloud revenue in 2022, Amazon holds around 40% of the market, according to technology research firm Gartner. Microsoft follows with 21.5%, while Google, with more than $9 billion in cloud revenue and 7.5% of the market, places fourth behind Alibaba Group.
It’s also no surprise that they are all now specifically targeting customers in healthcare, a complex and heavily regulated industry, says Dekate. He says that’s because if you’re able to prove use cases in a more complex environment, like healthcare or financial services, then it signals to other customers that generative AI is ready for broader adoption.
But no one is there yet. What all the cloud companies have presented to customers are building blocks, says Dekate. That is, they offer plenty of ways to use their AI platforms in bespoke applications that customers have to build themselves. But what those customers want are fully built solutions.
“Amazon, Google and Microsoft are fighting it out to dominate the commanding heights of the generative AI economy,” says Dekate. “But none of them have articulated a good enough vertical story.”
Because healthcare is so highly regulated and the consequences of mistakes are high, generative AI use cases need to start out very small. For HCA that means one hospital – UCF Lake Nona – is currently piloting the handoff tool as a proof-of-concept. The AI ingests patient data from the past 12 hours, including lab results, medication and important events, and produces a transfer summary that also includes suggestions for what the oncoming nurse should be thinking about in the next 12 hours, says Schlosser.
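As a rough illustration of the data flow Schlosser describes, the Python sketch below filters a patient's records to the previous 12 hours and assembles a prompt for a summarization model. Everything in it is hypothetical: the `ChartEvent` record, the category labels and the prompt wording are assumptions made for illustration, not HCA's or Google's actual implementation, and no model is called.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ChartEvent:
    """Hypothetical record for one item in a patient's chart."""
    time: datetime
    category: str  # assumed labels, e.g. "lab", "medication", "event"
    note: str


def recent_events(events, shift_end, hours=12):
    """Return events from the `hours` before `shift_end`, oldest first."""
    start = shift_end - timedelta(hours=hours)
    in_window = [e for e in events if start <= e.time <= shift_end]
    return sorted(in_window, key=lambda e: e.time)


def build_handoff_prompt(events, shift_end, hours=12):
    """Assemble a plain-text prompt asking a model for a handoff summary."""
    lines = [
        f"{e.time:%H:%M} [{e.category}] {e.note}"
        for e in recent_events(events, shift_end, hours)
    ]
    return (
        f"Summarize the last {hours} hours for the incoming nurse and "
        f"suggest what to watch for in the next {hours} hours.\n"
        + "\n".join(lines)
    )
```

The resulting string would then be passed to whichever foundation model is being tested. Keeping the time-window filter separate from the prompt text means the same inputs can be sent to different models for comparison.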
While it’s built using Google’s Vertex AI software, HCA has been experimenting with different foundation models, including PaLM and Med-PaLM. “We’re actually doing a lot of head-to-head testing right now to see where does the generic model work better, and where does a medically-trained model provide more accuracy and better outcomes,” says Schlosser. “I imagine both will actually have important roles in the future we’re trying to create.”
The idea of using multiple models to solve a complex problem, known as “composite” artificial intelligence, presents an interesting challenge for the cloud providers, says Dekate. They are simultaneously offering their own in-house models but also partnering with other companies in order to offer “the promise of choice,” he says. Dekate expects that Google, Microsoft and Amazon will increasingly offer services to help customers evaluate different models. Schlosser says HCA has so far been taking a manual approach to evaluation, having doctors and nurses compare the model’s outputs side by side with what the human team would do.
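The manual side-by-side review Schlosser describes could be tallied along these lines. This is a minimal sketch under assumed conventions: each clinician review is recorded as a hypothetical `(case_id, preferred)` pair, where `preferred` names whichever output the reviewer judged better, and the labels are placeholders rather than anything HCA has described.

```python
from collections import Counter


def preference_rates(reviews):
    """Return each option's share of reviewer preferences.

    `reviews` is an iterable of (case_id, preferred) pairs, where
    `preferred` is a label such as "general", "medical" or "tie"
    (assumed labels, for illustration only).
    """
    counts = Counter(preferred for _, preferred in reviews)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}
```

A tally like this only summarizes which output reviewers preferred; it says nothing about why, so in practice it would sit alongside the reviewers' written comments.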
Corrado says that at the state of the art right now, generative AI models can be likened to “an eager, studious assistant that’s trying very hard to do a good job. And you should view the output critically, as a draft and say, Okay, well, what did you miss? What did you get wrong?”
OpenAI has taken the view that bigger is better when it comes to the amount of data that the model is trained on. Its GPT-3 model, which was trained on the open internet, had around 175 billion parameters and the latest version, GPT-4, is thought to have more than 1 trillion parameters (though the company has not publicly confirmed the total number). Google says the largest PaLM and Med-PaLM models have 540 billion parameters. The company declined to comment on the size of PaLM 2.
But as models are trained on more and more data, there can be issues with performance. In July, a group of researchers from Stanford and UC Berkeley said their tests suggested that GPT-4’s performance had suffered some degradation over time, echoing anecdotal reports that can be seen on developer fora. Although this was a preliminary finding and researchers are still learning how generative AI models work, this does spark some concern, especially as it’s not entirely clear how such AI systems arrive at their answers. “One of the biggest problems in healthcare for these algorithms is going to be the difficulty they have with transparency,” says Lennox-Miller.
Corrado says these concerns are precisely why Google is experimenting with niche LLMs that are trained on narrower sets of data. Without tailoring models towards specific use cases, such as healthcare, he says, “you run the risk of just having a Swiss army knife, which is not the best knife, and it’s also not the best screwdriver. And it’s also not the best toothpick. And we think that it’s better, particularly in these high value settings, to do domain adaptation, understand what the use case is, and have the same kind of rigorous quality evaluation and version control that you would expect from a real product.”
Another challenge for most large language models is that they’re not constantly learning. They typically have a cutoff date for their training data. For example, the free version of ChatGPT was trained on data until September 2021. But knowledge in healthcare is always advancing, so doctors who use these tools need to have a good sense of how recent the underlying data is. Corrado says Google is still deciding what the cutoff will be, but that it will be communicated to customers. “We don’t rely on these systems to know everything about the practice of medicine,” says Corrado.
In the hospitals of the future, Schlosser envisions an “AI assistant to the care team” that he believes will have “amazing power in reducing administrative burden.” HCA has also been working with Google and the publicly traded ambient AI company Augmedix to automate medical note-taking in the emergency room. Schlosser says around 75 doctors at four HCA hospitals are using the technology. The “holy grail” for doctors, he says, is that they could focus on providing care to patients and “the documentation would take care of itself.” They’re starting in the emergency room because it is one of the most complicated venues in which to prove the technology actually works.
When it comes to using Augmedix’s tool, the doctor directly asks the patient for their consent to record the examination and use an AI tool for note-taking, says Schlosser. For the nurse handoff tool, which is not patient-facing, it falls under HCA’s broader privacy consent around using patient data for research and process improvement, he says. HCA is also working on using generative AI for ER discharge summaries, as well as handoffs from the ER to inpatient. Schlosser says as HCA thinks about scaling the use of AI for administrative purposes, the company will have to consider “the right way to let all patients know that an AI is part of a care delivery process.”
Consent and privacy are major concerns around the use of AI in healthcare, and Google generated significant controversy with an earlier partnership with the hospital system Ascension that used AI to analyze millions of medical records. In 2019, reports of the company’s “Project Nightingale” raised concerns about data privacy and security. Both Google and Ascension said the work was compliant with federal patient privacy laws.
In the case of PaLM and Med-PaLM, Google says that none of the models are being trained on patient data at HCA or any other customer. “HCA’s data is HCA’s data and nobody else’s,” Google Cloud CEO Thomas Kurian tells Forbes. “Think of it as a vault in our cloud that only is used to train the version of the model that they’re using. It’s not shared with anybody else. None of that data is used to improve our base model.”
Despite the challenges to generative AI, from technical capabilities to privacy and data concerns, Schlosser is optimistic that tools built on the technology will become part of the standard toolkit for doctors. HCA is taking a slow approach built on alleviating some of the burdens of doctors’ day-to-day jobs, he says, because he thinks once doctors start embracing AI, they’ll be positioned to guide the best way to use it for more complicated applications.
“I want clinicians to fully embrace AI as a partner that’s making their life easier, before we start getting into some of those more controversial areas,” he says.