
What AI-Driven De-Identified Healthcare Data Means for Patient Outcomes – with Leaders from NLP Logix and Orlando Health


This article is sponsored by NLP Logix and was written, edited, and published in alignment with our Emerj sponsored content guidelines. Learn more about our thought leadership and content creation services on our Emerj Media Services page.

The healthcare sector is witnessing a dramatic surge in digital information. According to an IDC white paper published in partnership with Seagate, healthcare data was projected to grow at a 36% compound annual growth rate (CAGR) between 2018 and 2025; today, that surge has made healthcare one of the world’s leading sources of data volume.

According to a study published in Healthcare Informatics Research by researchers at Chungnam National University in Daejeon, Korea, approximately 80% of medical data remains unstructured and untapped after it is created, spanning formats such as text, images, and signals.

The challenge of turning this data into insight lies in balancing innovation with privacy. A research paper from Administrative Data Research UK, funded by the Economic & Social Research Council, explains that amid tight regulatory scrutiny and fragile public trust in data use, organizations are under pressure to find responsible ways to use their data.

This is where de-identification comes in, sitting between regulatory scrutiny on one side and patient value on the other. By removing or masking personally identifiable information (PII) while preserving the clinical relevance of data, de-identification allows teams to use and share patient-level details safely. It not only enables regulatory compliance but also builds a foundation of trust.

Emerj recently hosted a special series on the ‘AI in Business’ podcast, featuring Ben Webster, Modeling and Analytics Team Lead at NLP Logix, and Brad Kennedy, Senior Director of Business Solutions Strategy at Orlando Health, to discuss how to balance data innovation with patient safety, using de-identification, benchmarking, and education to ensure responsible and effective AI adoption in healthcare.

Their conversation highlights the importance of aligning patient trust, privacy, and clinical outcomes when implementing AI in healthcare, and underscores how effective de-identification protocols, clear patient communication, and outcome-driven benchmarks enable responsible innovation without compromising data ethics or trust.

This article examines two key insights from these conversations for healthcare leaders deploying AI at their organizations:

  • Driving AI efficiency by balancing compliance and change: Prioritizing flexible compliance strategies while actively adapting workflows to accelerate AI adoption and deliver measurable value.
  • Implementing de-identification protocols to enable responsible innovation: Building robust de-identification processes to safely leverage patient data for AI and technology development, without compromising privacy.

Driving AI Efficiency by Balancing Compliance and Change

Episode: De-Identified Data and AI Adoption in Healthcare – with Ben Webster of NLP Logix

Guest: Ben Webster, Modeling and Analytics Team Lead, NLP Logix

Expertise: Advanced Analytics, Predictive Modeling, and Sentiment Analysis

Brief Recognition: Ben has spent the last 10 years at NLP Logix, first as a data scientist from 2013 to 2021 before being promoted to his current position as Modeling and Analytics Team Lead. In 2016, he earned his master’s degree in Mathematics and Statistics from the University of North Florida.

Ben opens the episode by explaining that de-identification is often driven by legal and regulatory concerns, particularly the need to remain compliant with HIPAA, and typically arises when teams want to use real-world data for experimentation. He outlines two primary methods organizations use to de-identify data:

  • Safe Harbor Method: Removing 18 specified types of identifiers, ranging from obvious ones, such as names and photos, to less obvious ones, including IP addresses and dates of birth. While legally straightforward, this approach often strips so much information that the data’s usefulness for research suffers (a simplified masking sketch follows this list).
  • Expert Determination: An expert assesses whether the data could reasonably be used to re-identify individuals. This method offers more flexibility but can cause delays, with legal reviews sometimes taking months and stalling projects before experimentation even begins.
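
To make the Safe Harbor approach concrete, below is a minimal Python sketch of identifier masking. It is our illustration rather than NLP Logix’s pipeline: it covers only a few of the 18 categories and uses regular expressions, whereas production systems typically rely on trained named-entity recognition models, since regexes alone miss names and free-form dates.

```python
import re

# Minimal sketch: regex masking for a few of HIPAA's 18 Safe Harbor
# identifier categories (dates, phone numbers, IP addresses, medical
# record numbers). Not a production de-identifier.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "IP": re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE),
}

def mask_identifiers(note: str) -> str:
    """Replace each matched identifier with its category placeholder."""
    for label, pattern in PATTERNS.items():
        note = pattern.sub(f"[{label}]", note)
    return note

print(mask_identifiers("Pt seen 03/14/2024, MRN: 889123, callback 904-555-0199."))
# -> Pt seen [DATE], [MRN], callback [PHONE].
```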

He mentions the growing interest in structuring healthcare information to support billing accuracy, medication decisions, and insights into patient home care, much of which remains hidden in free-text notes. He notes two key trends in the market:

  • Utilizing LLMs to create conversational interfaces that enable clinicians to query data intuitively, and 
  • Developing tools to translate and standardize data as patients transition across international healthcare systems with varying languages and standards.

Ben also shares a key lesson in choosing between building a solution in-house or using off-the-shelf software:

“Third-party software solutions often advertise performance metrics based on generic datasets, but those benchmarks may not reflect your specific use case. A tool that claims 90% accuracy could perform better — or significantly worse — when applied to your own data.

These differences can make or break a project, which is why it’s essential to validate performance on representative datasets. To do that, you need access to de-identified data that you can safely run through the system to assess its real-world effectiveness. Of course, this also requires careful consideration of privacy concerns, particularly when dealing with PHI and third-party APIs.”

—Ben Webster, Modeling and Analytics Team Lead at NLP Logix
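
Ben’s advice reduces to a small validation harness. In the hypothetical Python sketch below, `vendor_api.classify` is an assumed stand-in for whatever third-party call is under evaluation, and the 0.90 threshold echoes the advertised figure from his example:

```python
from typing import Callable

def validate_on_sample(predict: Callable[[str], str],
                       sample: list[tuple[str, str]]) -> float:
    """Accuracy of a candidate tool on a labeled, de-identified sample.

    `predict` stands in for the third-party system (e.g., a thin wrapper
    around its API); `sample` pairs de-identified inputs with gold labels
    drawn from your own records.
    """
    correct = sum(predict(text) == label for text, label in sample)
    return correct / len(sample)

# Hypothetical usage: check the vendor's claim against your own data.
# accuracy = validate_on_sample(vendor_api.classify, deidentified_sample)
# if accuracy < 0.90:  # the advertised benchmark
#     print("Vendor benchmark does not hold on our records")
```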

He stresses that even highly accurate AI offers no real benefit if users feel compelled to manually review every decision to catch rare errors; it only adds to their workload. For AI to deliver value, workflows must adapt. If users cling to old habits, projects risk failure and wasted investment.

AI adoption is smoothest in tasks like transcription, where users see immediate time savings by editing AI-generated drafts. Conversely, in complex areas like clinical diagnostics or claims management, resistance is stronger due to high stakes and entrenched workflows. Efficiency gains occur when users shift from manually performing all work to reviewing AI’s initial output.

Finally, Ben points out that the organizations most ready to change are those struggling with tight service-level agreements (SLAs) or continually needing to hire more staff to keep up with demand. In contrast, if a small, skilled team already handles a task efficiently at costs similar to automation, implementing AI may not make sense; real opportunities for AI lie where time pressures or staffing challenges clearly demand new solutions.

Implementing De-Identification Protocols to Enable Responsible Innovation

Episode: Keeping the Patient Voice in De-Identified Data Models – with Brad Kennedy of Orlando Health

Guest: Brad Kennedy, Senior Director of Business Solutions Strategy, Orlando Health

Expertise: Healthcare Operations Transformation, Value-Based Care Strategy, Enterprise Technology Implementation

Brief Recognition: Brad has over 20 years of experience in healthcare strategy and operations. He currently serves as Senior Director of Business Solutions Strategy at Orlando Health, where he leads enterprise transformation initiatives. He holds a Master of Healthcare Administration (MHA) from Texas A&M University.

In his episode, Brad emphasizes a “least data necessary” approach: collecting only the minimal personal health information needed, often relying on unique identifiers rather than full details. This limits the risk if the data is ever exposed. Despite these precautions, he acknowledges that innovation, particularly in AI, depends on leveraging patient data responsibly; robust de-identification processes are therefore essential for safely advancing healthcare technologies.
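
One common way to implement that “least data necessary” idea is keyed pseudonymization, sketched below in Python. The mechanism is a general illustration and an assumption on our part, not a description of Orlando Health’s actual systems:

```python
import hashlib
import hmac

# Illustrative sketch: direct identifiers are replaced with a keyed,
# one-way token so records can still be linked across systems without
# exposing who the patient is. The key must live in a separate,
# access-controlled store; note that this is pseudonymization, which
# HIPAA treats differently from fully de-identified data.
SECRET_KEY = b"kept-in-a-separate-key-vault"  # placeholder, not a real key

def pseudonymize(patient_id: str) -> str:
    return hmac.new(SECRET_KEY, patient_id.encode(), hashlib.sha256).hexdigest()[:16]

record = {"patient_id": "A-4471", "name": "Jane Doe", "spo2": 94}
safe_record = {"token": pseudonymize(record["patient_id"]), "spo2": record["spo2"]}
# `name` never leaves the source system; only the clinical value and a
# linkage token remain.
```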

Brad clarifies that the level of patient information needed varies based on the specific use case or study being conducted. For example, when working with medical imaging data, researchers typically don’t need to know the identity of the patient. They only need clinical context, like the condition being studied, to train AI models across thousands of image records. In such cases, no personal identifiers are necessary.

However, for other studies, especially those measuring outcomes or comparing across patient populations, specific demographic or clinical details become essential. These data points could include age, disease type, ZIP code, or other health attributes that help group patients into meaningful cohorts. Such details enable researchers to compare “apples to apples” and test whether a new device or AI solution works across different subgroups.

That said, Brad emphasizes that even in these cases, identifiable data, such as names, phone numbers, or addresses, is rarely, if ever, used. The focus is on studying relevant attributes of the patient, not who the patient is. Balancing need-based access to personal attributes with full anonymity lets teams maintain privacy while evaluating whether innovations yield meaningful outcomes.
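
In practice, this often means coarsening quasi-identifiers into bands rather than keeping exact values. The banding scheme in the sketch below is our own illustration; for reference, HIPAA’s Safe Harbor rule similarly truncates ZIP codes to their first three digits and aggregates all ages of 90 and over:

```python
# Illustrative cohort generalization: exact values are coarsened into
# bands so subgroups stay comparable ("apples to apples") without
# retaining identifying detail.
def to_cohort(age: int, zip_code: str, condition: str) -> dict:
    decade = (age // 10) * 10
    age_band = "90+" if age >= 90 else f"{decade}-{decade + 9}"
    return {"age_band": age_band, "zip3": zip_code[:3], "condition": condition}

print(to_cohort(47, "32806", "CHF"))
# -> {'age_band': '40-49', 'zip3': '328', 'condition': 'CHF'}
```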

He segues into an example from remote care: instead of keeping someone in a hospital bed, wearables can allow patients to recover at home while still being continuously monitored.

The wearables transmit real-time data back to the care team, enabling timely interventions in the event of an issue. For example, if the device detects an incident or irregularity, the team can proactively contact the patient and direct them to the appropriate care.
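
A simple threshold check captures the shape of that loop. Everything in the sketch below, from the vital sign and its bounds to the notify hook, is an illustrative assumption rather than a real device integration:

```python
# Hedged sketch of a remote-monitoring check: out-of-range readings
# trigger an alert so the care team can reach out proactively.
HEART_RATE_LIMITS = (40, 130)  # bpm; illustrative bounds only

def check_reading(patient_token: str, heart_rate: int, notify) -> None:
    """Flag out-of-range readings for care-team follow-up."""
    low, high = HEART_RATE_LIMITS
    if not low <= heart_rate <= high:
        # Only a pseudonymous token travels with the alert; the care
        # team re-identifies the patient inside its own secure system.
        notify(f"Irregular reading ({heart_rate} bpm) for token {patient_token}")

check_reading("9f3ac2e1", 148, notify=print)
# -> Irregular reading (148 bpm) for token 9f3ac2e1
```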

Brad notes that the model he describes benefits both sides:

  • Patients can recover in the comfort of their own homes.
  • Healthcare systems help alleviate hospital capacity strain and enhance operational efficiency.

He then stresses that for wearables to work safely and ethically, the use of patient data must be communicated, obtained with consent, and safeguarded. When implemented correctly, this type of technology not only enhances patient outcomes but also supports a more agile and efficient care system.

Brad then explains that innovation must start by understanding the baseline: What is the current state of care? What metrics are we trying to improve? Without that, it’s hard to assess whether a new solution is genuinely compelling.

He advocates for a data-driven approach, comparing outcomes between control and test groups to determine whether the technology is delivering value (a minimal comparison sketch follows the list below). But clinical results aren’t the only measure. Experience matters too:

  • Is the clinical workflow disrupted or improved?
  • Is the patient experience better?
  • Are patients using the tech as intended?
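
For the clinical half of that evaluation, the comparison Brad describes can be as simple as a two-sample test between cohorts. The sketch below invents both the metric and the numbers purely for illustration:

```python
from scipy.stats import ttest_ind  # assumes SciPy is available

# Hedged sketch of a control-vs-test outcome comparison; the metric
# (say, days at home without readmission) and all values are invented.
control = [12.1, 9.8, 11.4, 10.2, 8.9, 11.0]  # standard in-hospital recovery
test = [13.5, 12.2, 14.1, 11.8, 13.0, 12.7]   # wearable-supported recovery at home

stat, p_value = ttest_ind(test, control, equal_var=False)  # Welch's t-test
print(f"t = {stat:.2f}, p = {p_value:.3f}")
# A small p-value suggests the new workflow moved the baseline metric;
# the experience questions above still need their own evaluation.
```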

Brad returns to the example of wearable technology, which regulators increasingly classify as a medical device in many circumstances:

“Take something as simple as a wearable — like a health-monitoring ring. It can be easily removed, which is why patient education is so critical. We need to clearly explain why continued use matters, what we’re monitoring, and how it ties into their broader care plan. Patients need to understand not just the immediate purpose, but also the long-term goals we’re working toward once they return home.

Ultimately, it comes down to consistent communication and follow-through — making sure patients and clinicians know why we’re introducing a new process, ensuring they’re comfortable with it, and reinforcing that commitment through action.”

—Brad Kennedy, Senior Director of Business Solutions Strategy at Orlando Health
