How AI and analytics could solve healthcare’s big data problems

Published on May 25, 2023

What's bigger than one trillion? The number of data points the healthcare industry is producing globally. The industry currently generates 30% of the world’s data volume, and the International Data Corporation (IDC) estimates that there are over 2,314,000,000 terabytes of healthcare data today, a volume that has grown a staggering 11,000% since 2013. By 2025, healthcare will have become the fastest-growing source of data worldwide. The problem is that while bad data in enterprise means a lost sale, bad data in healthcare could mean a lost life. But that’s also why we at SignalFire see huge opportunities for startups to build better data infrastructure for the healthcare industry.

Why the sudden data explosion within U.S. healthcare? We’re seeing increased adoption of electronic health records (EHRs), regulatory enforcement, and a growing popularity of wearables and other health tracking devices. These are producing a breadth of data types, including patient information, clinical notes, test results, imaging data, and claims data.

However, managing and analyzing this massive amount of data presents significant challenges. To get anything done with the data, you first have to solve interoperability (getting systems to communicate effectively with each other), data normalization, privacy, and security, all before you even get to the sophisticated applications of data science.


Now’s the time, though. The world just got equipped with fresh AI tooling that can make sense of the seas of data flowing out of healthcare. And with everyone thinking about AI, many slower-moving incumbents will feel a sense of urgency to modernize their data stack.

In this post we’ll lay out some of the biggest technology and regulatory shifts affecting the healthcare data space, and a dozen specific opportunities where SignalFire is looking to invest. We’re going deep on healthcare data and the AI space given our bread and butter—we spent the past decade building our own proprietary AI data platform, Beacon, which tracks more than half a trillion data points, giving our portfolio companies unique insights into market intelligence. Our in-house expertise on data and machine learning gives us a unique lens into the power of data in healthcare and specific areas where we’re excited to back founders. 

Now let’s dive into the trends and opportunities around AI and analytics for hospitals, payors, pharma, and patients.

Part one: The data infrastructure layer

With AI, it’s garbage in, garbage out, so the industry first needs infrastructure to improve data quality. Before a model or analytics can be built on top of a data set, we need to address the following questions:

  1. Where do we get access to raw data?
  2. How do we cleanse and structure the data?
  3. How do we accurately join different datasets to create a full data record on a single patient?
  4. How do we store this data in a way that protects the patient’s privacy?
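To make the third question concrete, here is a minimal Python sketch of deterministic record linkage: grouping records from different source systems under one patient by normalizing name and date of birth. The `Record` fields, the source labels, and the matching rule are all assumptions made for illustration. Production systems typically rely on probabilistic matching and far richer identifiers, so treat this as a toy, not as how any company named in this post does it.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Record:
    source: str        # hypothetical source system, e.g. "ehr", "claims", "lab"
    first_name: str
    last_name: str
    dob: str           # ISO date string, YYYY-MM-DD
    payload: dict = field(default_factory=dict)


def normalize(text: str) -> str:
    """Lowercase the text and drop everything except the letters a-z."""
    return re.sub(r"[^a-z]", "", text.lower())


def match_key(rec: Record) -> tuple:
    """Build the linkage key: normalized first name, last name, and DOB."""
    return (normalize(rec.first_name), normalize(rec.last_name), rec.dob.strip())


def link_records(records):
    """Group records that share the same linkage key."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[match_key(rec)].append(rec)
    return dict(grouped)


records = [
    Record("ehr", "Maria", "O'Neil", "1980-04-12", {"allergy": "penicillin"}),
    Record("claims", "MARIA ", "ONeil", "1980-04-12", {"claim_id": "C-1"}),
    Record("lab", "Mario", "Oneil", "1980-04-12", {"test": "A1C"}),
]

linked = link_records(records)
for key, group in linked.items():
    print(key, [rec.source for rec in group])
```

The first two records collapse into one patient despite differences in casing and punctuation, while the third stays separate because the first name differs. An exact-key rule like this is easy to audit but misses typos and name changes, which is part of why accurate joining is a hard infrastructure problem.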

Until recently, raw healthcare data was becoming increasingly commoditized. But beginning in 2016, major new regulations emerged, starting with the 21st Century Cures Act.

The Cures Act mandated the bidirectional exchange of patient clinical data through the Trusted Exchange Framework and Common Agreement (TEFCA), with a growing number of approved use cases for data sharing. Essentially, TEFCA required every healthcare organization to make its data more accessible across states, hospitals, and provider networks so patients’ care teams always had the information they needed.

Now, in order to be eligible for access to this shared data, entities must receive the Qualified Health Information Network (QHIN) designation. After QHIN networks are fully established, only QHIN designees will be permitted to access the broader network of U.S. healthcare data, thereby raising the bar for other companies trying to solve the challenges at the data infrastructure layer. 

To be awarded the QHIN license, businesses need to build a highly compliant platform that can scale to enormous volumes of data. Among the first six entities to get a QHIN license alongside incumbents like Epic and CommonWell was SignalFire portfolio company Health Gorilla.


With a data lake that has access to the full longitudinal medical records of more than 90% of the U.S. patient population, Health Gorilla is opening up an extremely powerful data source for healthcare software developers and modeling how newer companies can work in tandem with regulators. They’ve solved a lot of the raw data access, cleanliness, integration, and privacy-safe storage issues to build a technical foundation for the next generation of solutions.

Part two: Building analytics and AI models on top of data

With improved infrastructure, companies can build unique analytics and AI models in highly verticalized categories within healthcare. These use cases often require specific data sets, allowing startups in this space to build data moats as a core part of their defensibility. Given this is a highly regulated space involving sensitive patient data, solutions here can distinguish themselves with top-grade privacy and security practices.


1. Analytics and AI applications for providers and hospitals

Providers are one of the major contributors to healthcare data: every time someone in this country completes a doctor’s visit, a medical record is generated. This data set, called clinical data, is one of the most valuable because it captures the essence of what we need in order to practice healthcare: the patient’s symptoms, blood test results, medication history, and so on. Here are a few areas where SignalFire is particularly excited:

  • Personalized patient engagement: Knowing a patient’s medical history, demographic information, and consumer preferences, how do we proactively engage with them in a way that encourages them to come in for preventive visits, learn about conditions they may be at higher risk for and the offerings available to them, and ultimately achieve better outcomes? This would help providers engage with their patients over the long term, increasing loyalty to the hospital while reducing costs compared with reactively seeing patients as they need care.
  • Clinical intake intelligence: How many times have you sat at a doctor’s office with a clipboard and pen in hand, already five minutes late to your appointment but still needing to fill out a basic questionnaire? There’s been an effort to digitize this experience, but Health Note takes it to the next level by sending patients a digitally powered (i.e., via SMS) dynamic questionnaire (the next question changes based on your responses to the previous question) before their visit, mirroring what a doctor would ask in the first five minutes of the actual visit. The solution saves time not only for the front desk administrator but also for the doctor, whose clinical note is already halfway auto-generated at the time of the visit.
  • Clinical decision support: Having access to the entire patient medical record plus an AI tool enables more precise diagnoses and real-time intervention at a higher accuracy than humans can accomplish alone. The overall adoption of these models is relatively limited today, and they typically need continuous data from outside the hospital’s four walls through an integrated continuous management system like all.Health. Where we see great potential is in tools that can assist clinicians, not replace them (e.g., providing a second set of eyes), speeding up diagnosis time by providing an assessment that a clinician can review. For example, Recora Health’s virtual cardiac platform can show providers which patients are more likely to have another heart attack after only several virtual visits.

  • Coding automation: Using AI models to autogenerate billing codes from unstructured doctor’s notes, providers get paid faster and bill more accurately. SignalFire led the Series A in CodaMetrix, which has a unique competitive advantage in this space: having spun out of Mass General Brigham, it has a data moat built on high-quality training data (read more about our investment here).
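The dynamic questionnaire idea from the clinical intake bullet, where the next question depends on the previous answer, can be sketched as a small question graph in Python. Everything here (the `QUESTIONS` table, the question IDs, the `run_intake` helper) is hypothetical and invented for illustration. It is not Health Note’s implementation, just one simple way to express branching intake logic.

```python
# Hypothetical question graph: each question maps specific answers to a
# follow-up question and falls back to a default when no branch matches.
QUESTIONS = {
    "reason": {
        "text": "What brings you in today?",
        "next": {"chest pain": "pain_duration", "checkup": "medications"},
        "default": "medications",
    },
    "pain_duration": {
        "text": "How long have you had the pain?",
        "next": {},
        "default": "medications",
    },
    "medications": {
        "text": "Are you taking any medications?",
        "next": {},
        "default": None,  # None ends the questionnaire
    },
}


def run_intake(answers, start="reason"):
    """Walk the question graph and return the (question_id, answer) pairs asked.

    `answers` maps question IDs to the patient's free-text replies; a real
    system would collect them interactively (for example, over SMS).
    """
    asked = []
    current = start
    while current is not None:
        question = QUESTIONS[current]
        answer = answers.get(current, "").strip().lower()
        asked.append((current, answer))
        # Branch on the answer if a specific follow-up exists.
        current = question["next"].get(answer, question["default"])
    return asked


chest_pain_visit = run_intake(
    {"reason": "Chest pain", "pain_duration": "2 days", "medications": "none"}
)
checkup_visit = run_intake({"reason": "checkup", "medications": "no"})
print(chest_pain_visit)
print(checkup_visit)
```

A patient reporting chest pain gets the extra duration question, while a routine checkup skips straight to medications. The collected pairs are the kind of structured input that could pre-populate part of a clinical note before the visit.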

2. Analytics and AI applications for payors

Payors’ business models—effectively an insurance business—inherently create incentive alignment with solutions that are using AI and analytics to drive down the cost of care while improving outcomes. Below are several examples of problem statements solved by companies using data and AI:

  • Medication adherence and management: The entire payor ecosystem pays an estimated $300 billion annually for: medications that don’t get consumed; more expensive medications vs. generic equivalents; and medications that patients no longer need. Better data can help create a fuller picture of a patient’s existing conditions and engage with them in a highly personalized way, using behavioral economics principles to nudge them to take the right medicine at the right time. It’s why we invested in Wellth.
  • Population health management: Every payor typically manages hundreds of thousands to millions of lives. Because they’re ultimately responsible for paying the bill, it’s important they understand how healthy their population is and which segments would benefit from proactive management of their health. A data-driven solution like Color would review the entire patient population data across all attributes and help patients navigate to the appropriate care they need. 
  • Payment integrity: Annually, $200–300 billion is spent on claims waste, fraud, and abuse. Ninety percent of the time, the reason payors overspend on claims comes down to human error—the person on the provider side has made a mistake and asked for more money than they should collect for a visit. The autonomous coding solution from CodaMetrix not only directly addresses this problem, but—with increased adoption—could establish the common language that would allow payors and providers to transact in an equal and fair manner.

3. Analytics and AI applications for pharma

Pharma spends, on average, over $1 billion and 10 years for a successful drug to come to market. Any data-driven and AI solutions that can expedite the drug development timeline or reduce costs are highly attractive to pharma:

  • Drug discovery: For researchers to identify what drugs to develop, they first need to know which conditions are becoming more prevalent and more costly in our population. Access to anonymized patient data can create a high-level view of conditions, and it can allow for subsequent deep dives into the patient journey for those living with a given condition who are undergoing different treatments. A strong, valuable dataset like that of Ovation.io can help identify more quickly which drugs to develop, a key benefit considering it can take years to get a new drug to market.
  • Synthetic clinical trial arm: Real-world evidence data helps pharma recruit the most relevant patients for clinical trials as well as build a “synthetic control arm.” Instead of collecting data from patients recruited for a trial who have been assigned to the control group, synthetic control arms model those comparators using existing data and thus save pharma both time and money on the other half of these long and expensive trials.
  • Post-FDA targeting: After a new drug has received FDA approval, allowing it to be marketed, the question is: who is the best target market? Data can help identify key opinion leaders, as well as the prescribing physicians for patients whose profile is a match for the drug.

4. Analytics and AI applications for patients

At the end of the day, all these solutions above that work with providers, payors, and pharma will always benefit the patient downstream in one way or another, as the patient is the center of our healthcare ecosystem. However, here are several other ways in which data insights and availability can help us directly:

  • Individual medical record access: Patients with chronic and rare diseases are currently tasked with manually assembling their information to get the best treatment possible. Under TEFCA today, only certain use cases of data sharing are approved: a provider can pull information if they’re treating a patient, but a patient cannot directly pull information on themselves. We think an individual use case is going to be unlocked in the next year, helping everyone from the overburdened patient with chronic illness to the person who’s simply trying to keep track of their immunization records.
  • Patient payments: Better data can help patients afford their healthcare. Payzen uses large amounts of patient data—spanning medical history, demographics, frequency of visits, and more—to provide patients with a personalized medical bill payment plan that has a 0% interest rate.

Building for healthcare? We want to hear from you

If you’re working on a startup in this space, we’d like to chat. Cold emails are welcome at yy@signalfire.com to connect with Yuanling Yuan (she goes by YY) from our healthcare investment team.


Yuanling (YY) Yuan, from SignalFire’s healthcare investment team

At SignalFire, we like to say, “Think of us as an extension of your team that scales with you.” Beyond our in-house Beacon AI for help with recruiting, we built our full-time Portfolio Experience team with world-class operators across a variety of functions, including the former chief people officer at Netflix for developing an engineer hiring strategy, the chief marketing officer at Stripe to optimize your sales process, and the former editor-at-large at TechCrunch to help you convert the value you deliver into a persuasive story. Our XIR program, meanwhile, pairs top industry leaders with high-potential companies as they scale and includes healthcare luminaries like Evolent Health ($EVH) founders Frank Williams and Tom Peterson.

We love helping healthcare companies solve their internal problems so they can heal the world. That approach of providing value far beyond our capital is why we have a net promoter score of 85 among founders, with 85% saying we are the most valuable investor on their cap table.

If you’re working on a company in the healthcare data, analytics, and AI space, come talk with us. We’ll share our full research and connections, and hope to earn the chance to hear about your next fundraise. By unlocking the secrets trapped within our medical data, we can build a healthier future for everyone. We can’t wait to see what you’re building.



SignalFire may engage Affiliate Advisors, Retained Advisors, and other consultants as listed above to provide their expertise on a formal or ad hoc basis. They are not employed by SignalFire and do not provide investment advisory services to clients on behalf of SignalFire. For more information on their specific roles, please contact us. Portfolio Company Endorsements: Certain portfolio company founders or Affiliate Advisors listed above may or may not be current investors in a SF fund in which they receive a fee reduction. Such fee reductions were not provided in exchange for or an incentive for their feedback, nor contingent upon the individual’s approval for SignalFire’s continued use. Please refer to our website for additional disclosures.

*Portfolio company founders listed above have not received any compensation for this feedback and may or may not have invested in a SignalFire fund. These founders may or may not serve as Affiliate Advisors, Retained Advisors, or consultants to provide their expertise on a formal or ad hoc basis. They are not employed by SignalFire and do not provide investment advisory services to clients on behalf of SignalFire. Please refer to our disclosures page for additional disclosures.
