Ambience Healthcare just closed $70M in Series B funding to cut away at burnout-inducing manual workflows using the latest advances in generative AI.
Ambience’s carving knife isn’t an AI scribe, a coding solution, or a referral tool, but an “AI operating system” that promises to be all those things at once.
That operating system consists of a holistic suite of genAI applications catering to an impressively broad set of use cases. Each app is customized for dozens of specific specialties, care models, and reimbursement frameworks:
- AutoScribe: AI medical scribe that works across all specialties
- AutoRefer: AI referral letter support for both PCPs and specialists
- AutoAVS: After-visit summary tool that generates custom educational content
- AutoCDI: Point-of-care clinical documentation integrity assistant that analyzes notes and EHR context to ensure ICD-10 codes, CPT codes, and documentation are aligned
Ambience has kept tight-lipped about both its customer count and LLM provider, but we do know that it has:
- $100M in total funding since launching in 2020
- Marquee customers like UCSF, Memorial Hermann, and John Muir Health
- Investments from Silicon Valley heavyweights like Kleiner Perkins, a16z, and OpenAI (probably a decent hint toward the unrevealed LLM partner)
The newly-raised capital will accelerate Ambience’s product roadmap and allow it to build dedicated support teams for its health system partners.
- The first product up on that roadmap is AutoPrep, an intelligent pre-charting solution that equips clinicians with suggestions for the visit agenda and potential conditions to screen for.
Ambience’s operating system strategy not only gives it a huge total addressable market, but also positions it apart from well-established competition like Nuance and Augmedix, as well as a hungry pack of genAI up-and-comers such as Nabla and Abridge.
- A continuously learning OS with “a single shared brain” sounds like a versatile way to break down silos, but the flip side of that coin is that providers looking for an answer to a specific problem might be tempted to go with a more specialized solution.
Driving adoption of any software is hard. Crafting a beautiful user experience is hard. Tailoring a continuously learning AI operating system to every medical specialty sounds extremely hard. At the end of the day, Ambience’s approach is about as ambitious as it gets, but it carries massive advantages if it can execute.
The New England Journal of Medicine’s just-released NEJM AI publication is off to the races, with its February issue including a stellar breakdown of how academic medical centers are managing the influx of predictive models and AI tools.
Researchers identified three governance phenotypes for managing the AI deluge:
- Well-Defined Governance – health systems have explicit, comprehensive procedures for the evaluation of AI and predictive models.
- Emerging Governance – systems are in the process of adapting previously established approaches for things like EHRs to govern AI.
- Interpersonal Governance – a small team or single person is tasked with making decisions about model implementation without consistent evaluation requirements.
Regardless of the phenotype, interviews with AI leadership at 13 academic medical centers revealed that chaotic implementations are hard to avoid, partly due to external factors like vague regulatory standards.
- Most AI decision makers were aware of how the FDA regulates software, but believed those rules were “broad and loose,” and many thought they only applied to EHRs and third-party vendors rather than health systems.
AI governance teams report better adherence to new solutions that prioritize limiting clicks for providers when they’re implemented. Effective governance of predictive models requires a broader approach, yet streamlining workflows is still the primary consideration for most implementations. That’s a recipe for trouble down the road considering predictive models’ impact on patient care, quality, and health equity.
Even well-equipped academic medical centers are struggling to effectively identify and mitigate the countless potential pitfalls that come along with predictive AI implementation. Existing AI governance structures within healthcare orgs all seem to be in need of additional guidance, and more guardrails from both the industry and regulators might help turn AI ambitions into AI-improved outcomes.
New research in npj Digital Medicine suggests that virtual reality might be part of the answer to the nation’s mental health provider shortage, as long as patients don’t mind if their therapist is an AI avatar.
The small study had 14 participants with moderate anxiety or depression undergo immersive therapy sessions led by a trained digital avatar developed by Cedars-Sinai.
Known as XAIA, or the eXtended-Reality Artificially Intelligent Ally, the program provides a way for patients to self-administer conversational therapy in relaxing virtual reality environments, such as a creek-side meadow or a sunny beach retreat.
- It’s unclear what the therapist avatar actually looks like, but we’re going to choose to believe it looks like this picture of XAIA in the diagram of the conversation logic.
Throughout the 30-minute therapy sessions, with topics ranging from loneliness and family problems to financial distress, XAIA successfully applied key psychotherapeutic techniques:
- Observations that reflect an understanding of the user’s issues; Ex. “Your worries about your health and your living situation are clearly causing you a lot of distress”
- Normalizing feelings; Ex. “It’s understandable to feel like a failure when life gets rough”
- Showing empathy; Ex. “It must be a challenging time for you”
- Validation and praise where appropriate; Ex. “Your practice of not taking these relationships for granted is truly commendable”
Participants frequently responded to XAIA as if it were a human therapist, sharing raw emotions like, “I feel like I’m a failure. The only thing I have to look forward to – I know you’re going to laugh – is to go see Taylor Swift … I’m sorry I’m crying.”
- XAIA’s response: “Firstly, there’s no need to apologize for crying. These feelings are valid.”
Most participants described XAIA as approachable, empathetic, and intelligent, but it’s worth noting that a few mentioned they would still prefer a human therapist if given the choice.
Although this wasn’t exactly the largest study we’ve ever covered, the results provide early evidence that a combination of VR and AI therapy could be part of the solution to balancing behavioral health’s supply and demand equation. Over half of people facing mental health disorders aren’t getting the treatment they need, and even if XAIA is only a better alternative than no treatment at all, new advances will make the AI+VR path more promising going forward.
Nabla hit the ground running in 2024 with the close of $24M in Series B funding, vaulting the startup’s valuation to $180M less than a year after the US launch of its Nabla Copilot ambient AI assistant.
Nabla Copilot checks all the usual boxes for an automated clinical note solution, quickly transforming patient-provider conversations into note drafts that can be customized to meet different format preferences.
- Since the US rollout in March of last year, Nabla Copilot has grown to over 20k users at small practices and larger systems alike, mostly split between primary care physicians (50%), mental health providers (30%), and a mix of other specialties.
- While Paris-based Nabla maintains a strong position in the European market, it hasn’t wasted any time finding US customers, and recently chained together marquee partnerships with Permanente Medical Group and NextGen Healthcare.
Nabla’s approach to model development is where it starts to differentiate itself from a pack of equally hungry competitors like Abridge (which just closed its own Series B) and Nuance (which is full-speed-ahead with the deployment of DAX Copilot).
- Although Nabla has historically leveraged GPT-4 to power its backend, it’s now focused on migrating toward a combination of homegrown and open source AI models like those championed by Meta AI Chief Yann LeCun, also an early investor.
- By constantly testing and fine-tuning different models for specific tasks, Nabla is aiming to be one of the most nimble companies in the medical scribe arena, while also sidestepping the hefty licensing fees charged by commercial models.
The next step for Nabla outside of breaking its reliance on OpenAI is to launch a new solution geared toward automatically generating billing codes, which could debut before the end of the quarter. Mandarin, Portuguese, and Russian translation features are also on this year’s roadmap, and would add to Nabla’s existing capabilities for English, French, and Spanish.
Nabla is making its agility the driving force behind its business strategy, turning away from generalist AI models in favor of a collection of more narrow algorithms designed to excel at specific use cases. It now has another $24M to fuel the transition, and also hinted that another $10M could be on the way as early as February.
The New England Journal of Medicine is adding to its library of top tier publications with the launch of a new journal focused on artificial intelligence – NEJM AI – and it’s gearing up for the January debut with a sneak peek at a few early-release articles.
Use of GPT-4 to Diagnose Complex Clinical Cases was a standout study from the preview, finding that GPT-4 correctly diagnosed over half of complex clinical cases.
Researchers asked GPT-4 to provide a diagnosis for 38 clinical case challenges that each included a medical history along with six multiple choice options. The case mix skewed toward infectious disease (15 cases, 39.5%), endocrinology (five cases, 13.1%), and rheumatology (four cases, 10.5%).
- GPT-4 was given the plain unedited text from each case, and solved each one five times to evaluate reproducibility.
- Those answers were compared to over 248k answers from online medical-journal readers, which were used to simulate 10k complete sets of human answers.
GPT-4 correctly diagnosed an average of 21.8 cases (57%), while the medical-journal readers correctly diagnosed an average of 13.7 cases (36%). Not too shabby considering the LLM could only leverage the case text and not the included graphics.
- Based on the simulation, GPT-4 also performed better than 99.98% of all medical-journal readers, with high reproducibility across all five tests (lowest score was 55.3%).
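The simulation described above amounts to a simple bootstrap: for each case, draw one recorded reader answer at random from that case’s pool of responses, repeat across all cases to assemble a complete answer set, and tally the correct diagnoses. Here’s a minimal sketch of that approach using hypothetical toy data (the function name and answer pools are our own illustration, not the study’s actual code or responses):

```python
import random

def simulate_reader_sets(case_answers, correct, n_sets=10_000, seed=0):
    """Simulate complete sets of human answers by sampling one recorded
    reader response per case, then counting correct diagnoses per set.

    case_answers: list of lists, one pool of recorded answers per case
    correct: the correct option for each case
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_sets):
        # One simulated "reader" answers every case by borrowing a
        # random real response from that case's pool
        score = sum(
            rng.choice(pool) == answer
            for pool, answer in zip(case_answers, correct)
        )
        scores.append(score)
    return scores

# Hypothetical toy data: 3 cases, each with a pool of reader answers (A–F)
pools = [["A", "A", "B", "C"], ["D", "D", "D", "E"], ["F", "A", "F", "B"]]
truth = ["A", "D", "F"]

scores = simulate_reader_sets(pools, truth, n_sets=1000)
avg = sum(scores) / len(scores)
print(f"average correct out of {len(truth)}: {avg:.2f}")
```

Scaling this idea up to 38 cases and 10k simulated sets is what lets the researchers place GPT-4’s 21.8-case average on a full distribution of human performance rather than comparing against a single pooled accuracy number.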
A couple of caveats to consider are that medical-journal readers aren’t licensed physicians, and that real-world medicine doesn’t provide convenient multiple choice options. That said, a separate study found that GPT-4 performed well even without answer options (44% accuracy), and these models will only grow more precise as multimodal data gets incorporated.
The race to bring AI to healthcare is on, and it’s generating a stampede of new research investigating the boundaries of the tech’s potential. As the hype of the first lap starts to give way to more measured progress, NEJM AI will most likely be one of the best places to keep up with the latest advances.
The White House’s long-awaited executive order on “Safe, Secure, and Trustworthy” artificial intelligence is finally here, and it left little room to miss its underlying message: the laissez-faire era of AI regulation is over.
Among the 100+ pages of actions guiding the direction of responsible AI development, President Biden laid out several initiatives poised to make an immediate impact within healthcare, including…
- Calling on HHS to create an AI task force within six months to assess new models before they go to market and oversee their performance once they do
- Requiring that task force to build a regulatory structure that can “maintain appropriate levels of quality” in AI used for care delivery, research, and drug development
- That structure will require healthcare AI developers to share their safety testing outcomes with the government
- Balancing the added regulation by ramping up grantmaking for AI development in areas such as personalized immune-response treatments, reducing burnout, and improving data quality
- Standing up AI.gov to serve as the go-to resource for federal AI standards and hiring, a decent signal that there’ll be actual follow-through to cultivate public sector AI talent
The FDA has already approved upwards of 520 AI algorithms, and has done well with predictive models that take in data and propose probable outcomes.
- However, generative AI products that respond to human queries require “a vastly different paradigm” to regulate, and FDA Digital Health Director Troy Tazbaz believes any new structure will involve ongoing audits to ensure continuous safety.
There’s already been tons of great post-game analysis on these developments, with the general consensus looking like cautious optimism.
- While some appreciate the order’s whole-of-government approach to AI, others worry that “excessive preemptive regulation” could slow AI’s progress and delay its benefits.
- Others are skeptical that the directives will be carried out at all, given the difficulty of hiring enough AI experts in government and passing the needed legislation.
President Biden’s executive order aims to thread the needle between providing protection and encouraging innovation, but time will tell whether it’ll deliver on some much-needed guardrails. Although AI is a lightning-quick industry that doesn’t exactly lend itself to the type of centralized long-term planning envisioned in the executive order, more structure should be an improvement over regulatory uncertainty.
Momentum makes magic, and few startups have more of it than AI medical scribe Abridge after landing $30M in Series B funding from Spark Capital and high-profile strategics like CVS Health, Kaiser Permanente, and Mayo Clinic.
Abridge’s generative AI platform converts patient-provider conversations into structured note drafts in real time, slashing hours from administrative burdens by generating summaries that rarely require further input (clinicians edit less than 9%).
The Series B is one of this year’s largest raises for pure play healthcare AI, an area that now accounts for about a quarter of all capital flowing into health IT.
One of the reasons why investors are taking such a keen interest in Abridge is its partnership hot streak, which includes Epic bringing it on as the first startup in its new Partners and Pals program – a move that will make Abridge available directly within Epic’s EHR.
- It also probably doesn’t hurt that Abridge isn’t shy about sharing its performance data and machine learning research, giving it one of the deepest publication libraries of any company we’ve ever covered.
- On top of that, Abridge has been racking up a lengthy list of deployments at health systems such as UPMC, Emory Healthcare, and University of Kansas Health System.
The competition is fierce in the AI scribe arena, which is packed with hungry startups like Suki and Nabla, as well as a thousand-pound gorilla named Nuance Communications.
- Half a million doctors use Nuance’s DAX dictation software, with “thousands” more already up-and-running on its new fully-automated DAX Copilot.
Some key differentiators give Abridge and its user base of 5,000 clinicians a solid shot at closing the distance, including “linkages” that map everything in the note to its source in both the transcript and audio (Nuance provides the transcript but not the recording).
- Abridge also developed its own ASR stack (automatic speech recognition), enabling it to do things like account for new medication names and excel at multilingual documentation, meaning it can generate an English note from a Spanish conversation.
Abridge is emerging as a standout in the clinical documentation race, with DNA that’s as healthcare-native as it is AI-native. The executive team is lined with practicing physicians and machine learning experts, giving Abridge an advantageous understanding of not only the technology, but also the hurdles it will take for that technology to take hold in healthcare.
At a time when new healthcare AI solutions are getting unveiled every week, a study in Nature Machine Intelligence found that the way people are introduced to these models can have a major effect on their perceived effectiveness.
Researchers from MIT and ASU had 310 participants interact with a conversational AI mental health companion for 30 minutes before reviewing their experience and determining whether they would recommend it to a friend.
Participants were divided into three groups, which were each given a different priming statement about the AI’s motives:
- No motives: A neutral view of the AI as a tool
- Caring motives: A positive view where the AI cares about the user’s well-being
- Manipulative motives: A negative view where the AI has malicious intentions
The results revealed that priming statements certainly influence user perceptions, and the majority of participants in all three groups reported experiences in line with expectations.
- 88% of the “caring” group and 79% of the “no motive” group believed the AI was empathetic or neutral – despite the fact that they were engaging with identical agents.
- Only 44% of the “manipulative” group agreed with the primer. As the authors put it, “If you tell someone to be suspicious of something, then they might just be more suspicious in general.”
- As might be expected, participants who believed the model was caring also gave it higher effectiveness scores and were more likely to recommend it to a friend. That’s obviously relevant for those developing similar mental health chatbots, but it’s also a key insight for presenting any AI agent to new users.
An interesting feedback loop was also found between the priming and the conversation’s tone. People who believed the AI was caring tended to interact with it in a more positive way, making the agent’s responses drift positively over time. The opposite was true for those who believed it was manipulative.
The placebo effect is a well-documented cornerstone of medical literature, but this might be the first study to bridge the phenomenon from sugar pill to AI chatbot. Although AI is often thought of as primarily an engineering problem, this research does a great job highlighting how human factors and the power of belief play a huge role in the perceived effectiveness of the technology.
Bain & Company is back at it again with more generative AI research, this time offering a series of ways for providers to get the most out of the tech without falling into potholes of hype.
The in-depth report gives a comprehensive overview of the current generative AI landscape, and delivers solid insight into the priorities of health system executives (N=94):
- Top use case priorities (next 12 months): charge capture & reconciliation (39), structuring & analysis of patient data (37), workflow optimization (36). [Chart 1]
- Top use case priorities (2-5 years): predictive analytics & risk stratification (44), clinical decision support (41), diagnostics & treatment recommendations (37). [Chart 2]
- Biggest barriers to implementation: resource constraints (46), lack of technical expertise (46), regulatory & legal considerations (33). [Chart 3]
Start small to go big. Although the survey itself included some valuable stats, the spotlight was stolen by Bain’s particularly pragmatic framework for guiding new implementations.
- Pilot low-risk applications with a narrow focus. Bain found that the systems already seeing the most success with generative AI are testing solutions in low-risk use cases where they already have the right data and can create tight guardrails (chatbot support, scheduling, rev cycle).
- Decide to acquire, partner, or build. Bain recommends that CEOs think about different use cases based on availability of third-party tech and importance of the initiative.
- Funnel experience into bigger initiatives. As generative AI starts to mature, organizations that gain experience and strategy alignment today will be best positioned for the more transformative use cases once they become clear.
- Generative AI isn’t a strategy unto itself. Bain found that the trait separating top CEOs is their discipline, ensuring that every generative AI initiative reinforces their overarching goals as opposed to implementing shiny bells and whistles.
It’s easy to get caught up in the generative AI hype cycle, so it was refreshing to see Bain recommend the one-foot-in-front-of-the-other approach to new implementations. Nearly every hospital boardroom is debating a massive list of potential AI investments, and although the home run use cases will be here soon, the consensus strategy for getting on base seems to be making low-risk plays with an immediate impact.
Although we touched on Hippocratic AI’s emergence from stealth last week, the startup struck all the right chords with its talk of generative AI and large language models so we’re doing a deeper dive to unpack the hype.
Hippocratic AI debuted with $50M from a massive seed round co-led by the VC power duo of General Catalyst and a16z, giving the company a “triple-digit millions” valuation right out of the gate.
- The company’s mission is to transform healthcare through the power of “safety-focused” generative AI, but potential use cases are still in the works, with early ideas revolving around consumer-facing tasks like diet planning and medication reminders.
- The founders even told STAT that their goals for the funding are only exploratory: developing an LLM that’s fine-tuned for healthcare, heavily testing it against knowledge benchmarks, then measuring its bedside manner.
If including Hippocratic in the name wasn’t enough of a hint, pressure-testing the model’s accuracy is in the company’s DNA, and it isn’t planning on rushing into clinical care.
- Accuracy is ensured through reinforcement learning from human feedback (RLHF) performed by medical professionals – a strategy that’s apparently outperforming GPT-4 on 105 of 114 healthcare certifications. [Comparison Chart]
- As for bedside manner, Hippocratic is planning on “detecting tone” and “communicating empathy” better than rival models, and developed its own benchmark for behaviors such as “taking a personal interest in a patient’s life.”
Healthcare hasn’t always been kind to AI-first businesses, with IBM having to let go of its Watson Health division and Babylon recently meeting the end of its road as a publicly traded company. That said, both of those examples paddled too early to catch the current generative AI wave with its mile-long barrel of new tech and excitement. It’s too early to tell whether Hippocratic will buck the trend, but if there was ever a moment to try it – this is it.