Understanding the cancer patient journey: How pharma can use NLP for deeper insights
Apr 3rd, 2025

Cancer patients and their caregivers often turn to online communities to share their experiences, seek advice, and express frustrations. These digital footprints provide a wealth of insight into the patient journey — but how can life sciences companies effectively tap into this information?
To explore this, we employed natural language processing (NLP) and large language models (LLMs) to analyze patient-generated content. The result? A deeper, unbiased understanding of the patient journey — one that traditional surveys and claims data often miss.
These insights are invaluable for pharmaceutical companies looking to improve patient support programs, refine educational materials, and strengthen their overall engagement strategies. By identifying the specific pain points patients face, companies can step in where it matters most.
We recently shared our findings at Pharma SOS 2025 during a session called “Advancing healthcare and pharmaceutical insights generation: Groundbreaking solutions using NLP.” The presentation sparked meaningful discussions about how companies can apply these insights to build more effective, patient-centric strategies.
Here’s how our approach helped illuminate the needs of patients and caregivers, and how these insights can help pharma shape better engagement strategies.
Traditional approaches fail to understand the patient journey
Pharma companies often rely on claims data to understand disease progression, but these datasets only tell part of the story. While they provide insights into when patients move from one treatment to another, they often fail to answer a critical question: Why?
Similarly, patient surveys tend to offer skewed perspectives. Since surveys rely on prompted questions, they may miss unexpected pain points. Patients may also feel pressured to answer in socially desirable ways or limit their responses to fit the scope of the survey. For companies seeking to better engage with patients and caregivers, this limited view can lead to less informed strategies.
The absence of unfiltered, authentic patient perspectives creates a gap in understanding. This is where NLP offers a better solution.
Leveraging NLP for unbiased, authentic patient perspectives
One of our clients wanted to understand the disease journey of rare cancer patients through the eyes of patients and caregivers, and needed a scalable, unbiased way to do it. Their goals were to address the challenges patients and their caregivers face when navigating the healthcare system, pinpoint where extra support could make a real impact, and gain insights to inform commercial, medical affairs, and field sales and marketing strategies.
To achieve this, we applied an NLP-based solution to analyze online conversations in which patients share their experiences through blogs, videos, and patient forums. In total, we collected and examined more than 5.5 million words from 50,000 posts across 30 publicly available sources, covering approximately 1,200 patients. Ethical considerations guided our data collection: we used only openly accessible content from sources whose terms of service permitted the use of this data.
Unlike surveys that rely on predetermined questions, this method captured unbiased, spontaneous conversations, offering an authentic glimpse into the patient journey and revealing what matters most to patients without external influence.
Turning patient conversations into clear insights with NLP
To fully harness the power of this unstructured data, we employed large language models and other advanced tools. Rather than relying on simple keyword matching, our approach focused on understanding both the words and their context. This was essential for distinguishing similar terms, such as identifying whether “marker” referred to a biomarker or a pen.
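To make the difference between keyword matching and context concrete, here is a deliberately simple Python sketch. It is a toy illustration, not the LLM-based pipeline described in this post: the cue-word lists and the function name `disambiguate_marker` are hypothetical, and a real system would rely on a contextual language model rather than word overlap.

```python
import re

# Hypothetical cue words chosen for illustration only.
SENSE_CUES = {
    "biomarker": {"blood", "test", "tumor", "levels", "lab", "oncologist", "results"},
    "pen": {"whiteboard", "draw", "ink", "write", "paper", "color"},
}


def disambiguate_marker(sentence: str) -> str:
    """Guess which sense of 'marker' a sentence uses from the surrounding words."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    # Score each sense by how many of its cue words appear in the sentence.
    scores = {sense: len(tokens & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    # With no supporting context at all, refuse to guess.
    return best if scores[best] > 0 else "unknown"


if __name__ == "__main__":
    print(disambiguate_marker("My oncologist said the tumor marker levels dropped."))
    print(disambiguate_marker("I used a marker to draw on the whiteboard."))
```

A plain keyword search for "marker" would treat both sentences the same way; even this crude use of context separates them.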
One of the key techniques we used was named entity recognition (NER), which helped us identify entities like diseases and procedures so they could be standardized. We then extracted relationships between these entities and used assertion models to determine if events or conditions were past or ongoing, breaking complex discussions into standardized frameworks.
We also used these models to capture metadata, such as age, gender, and other patient characteristics, and engineered prompts to extract the broader themes in each post, uncovering the emotional landscape of each theme. The models not only identified primary and secondary emotions, such as anger, fear, and frustration, but also provided a justification for each assigned emotion based on the post’s content. These justifications served as a validation check and helped us understand the underlying reasons for emotions like frustration.
Overall, our approach offered a more nuanced view of patient challenges and experiences than traditional claims data or surveys alone.
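The idea of tagging entities, mapping them to a standard form, and assigning a past or ongoing status can be sketched in a few lines of Python. This is a rule-based stand-in under stated assumptions, not the models used in the analysis: the lexicon, the cue-word sets, and the names `Mention` and `extract_mentions` are all hypothetical, and relationship extraction is left out for brevity.

```python
import re
from dataclasses import dataclass

# Hypothetical lexicon: surface form -> (canonical name, entity label).
LEXICON = {
    "chemo": ("chemotherapy", "PROCEDURE"),
    "chemotherapy": ("chemotherapy", "PROCEDURE"),
    "transfusion": ("blood transfusion", "PROCEDURE"),
    "transfusions": ("blood transfusion", "PROCEDURE"),
    "anemia": ("anemia", "CONDITION"),
}
# Hypothetical cue words for a crude assertion-status rule.
PAST_CUES = {"finished", "stopped", "completed", "had", "ago"}
ONGOING_CUES = {"still", "currently", "ongoing", "now"}


@dataclass(frozen=True)
class Mention:
    surface: str    # text as written in the post
    canonical: str  # standardized entity name
    label: str      # entity type
    status: str     # "past", "ongoing", or "unspecified"


def extract_mentions(post: str) -> list[Mention]:
    """Tag known entities sentence by sentence and attach an assertion status."""
    mentions = []
    for sentence in re.split(r"(?<=[.!?])\s+", post):
        tokens = re.findall(r"[a-z]+", sentence.lower())
        token_set = set(tokens)
        # Ongoing cues take priority over past cues within one sentence.
        if token_set & ONGOING_CUES:
            status = "ongoing"
        elif token_set & PAST_CUES:
            status = "past"
        else:
            status = "unspecified"
        for tok in tokens:
            if tok in LEXICON:
                canonical, label = LEXICON[tok]
                mentions.append(Mention(tok, canonical, label, status))
    return mentions


if __name__ == "__main__":
    for m in extract_mentions("I finished chemo two months ago. I still need transfusions."):
        print(m)
```

Mapping "chemo" and "chemotherapy" to one canonical name is what lets mentions be counted consistently across thousands of posts written in very different styles.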
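A prompt of this kind, plus a check on the model's reply, might look like the following Python sketch. Everything here is an assumption made for illustration: the prompt wording, the JSON keys, the emotion labels beyond anger, fear, and frustration, and the function names `build_prompt` and `parse_response`. No specific LLM API is called; the reply is handled as a plain string.

```python
import json

# Anger, fear, and frustration come from the analysis; the rest are hypothetical.
ALLOWED_EMOTIONS = {"anger", "fear", "frustration", "sadness", "hope", "relief"}
REQUIRED_KEYS = ("theme", "primary_emotion", "secondary_emotion", "justification")


def build_prompt(post: str) -> str:
    """Assemble an instruction asking for a theme, two emotions, and a justification."""
    labels = ", ".join(sorted(ALLOWED_EMOTIONS))
    return (
        "Read the patient post below and reply with JSON only, using the keys "
        '"theme", "primary_emotion", "secondary_emotion", and "justification".\n'
        f"Allowed emotions: {labels}.\n"
        "The justification must quote or paraphrase the post.\n\n"
        f"Post:\n{post}"
    )


def parse_response(raw: str) -> dict:
    """Validate a model reply; reject unknown labels or an empty justification."""
    data = json.loads(raw)
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    for key in ("primary_emotion", "secondary_emotion"):
        if data[key] not in ALLOWED_EMOTIONS:
            raise ValueError(f"unexpected emotion label: {data[key]!r}")
    if not str(data["justification"]).strip():
        raise ValueError("justification is empty")
    return data


if __name__ == "__main__":
    # Hand-written example reply, not real model output.
    reply = (
        '{"theme": "treatment delays", "primary_emotion": "frustration", '
        '"secondary_emotion": "fear", '
        '"justification": "The author waited weeks for therapy to start."}'
    )
    print(parse_response(reply)["primary_emotion"])
```

Requiring a justification alongside each label gives a reviewer something concrete to check against the original post.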
Uncovering core issues from patient and caregiver conversations
Our analysis uncovered major themes representing the concerns of patients and caregivers, highlighting the emotional, physical, and logistical challenges they face during the patient journey. From frustrations with treatment efficacy and adverse effects to difficulties navigating the healthcare system, these themes shed light on the biggest barriers to receiving adequate care. The table below highlights the subthemes we identified related to patient and caregiver frustrations.
Subthemes around the frustration expressed by patients and their caregivers
| Subtheme | Percentage of posts |
| --- | --- |
| Treatment modalities | 31% |
| Hematological management | 18% |
| Navigating the healthcare ecosystem | 15% |
| Constraints in knowledge and information | 14% |
| Obstacles in support and communication | 11% |
| Personal toll and symptomatic burden | 7% |
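Shares like the ones in this table can be produced by counting how many labeled posts fall under each subtheme. The Python sketch below shows one way to do that, assuming each post carries a single subtheme label; the function name `subtheme_shares` and the four sample labels are hypothetical and are not drawn from the study data.

```python
from collections import Counter


def subtheme_shares(labels: list[str]) -> dict[str, float]:
    """Return each subtheme's share of posts as a percentage, largest first."""
    if not labels:
        return {}
    counts = Counter(labels)
    total = len(labels)
    # most_common() orders subthemes from most to least frequent.
    return {name: round(100 * n / total, 1) for name, n in counts.most_common()}


if __name__ == "__main__":
    # Made-up labels for illustration only.
    sample = [
        "Treatment modalities",
        "Treatment modalities",
        "Navigating the healthcare ecosystem",
        "Constraints in knowledge and information",
    ]
    for name, share in subtheme_shares(sample).items():
        print(f"{name}: {share}%")
```

If a post could carry more than one subtheme, the denominator would need to be the number of posts rather than the number of labels.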
After identifying these subthemes, we dug deeper into each one to uncover specific frustrations. In looking at treatment-related frustrations, for example, several key issues emerged: patients voiced concerns about the efficacy and side effects of medications, challenges with alternative treatments, and delays in receiving therapy. We also identified frustrations related to the narrow range of treatment options, continued dependence on transfusions, and inadequate pain management.
Where pharma can step in
These insights provide a unique opportunity for pharma companies to address patient needs across the entire journey and disease progression. While our analysis doesn’t prescribe specific interventions, it highlights key areas where pharma can create the most meaningful impact for rare cancer patients—ranging from improving healthcare system navigation to closing critical knowledge gaps.
For instance, can manufacturers partner with healthcare providers to create clearer pathways for treatment access? Patient advocacy programs could play a larger role in easing coverage and logistical challenges, and could provide more transparent information on insurance coverage, treatment centers, and financial assistance programs to reduce patient and caregiver frustration.
Similarly, with many patients struggling to find reliable information, how can pharma bridge this gap by providing accessible, educational content on treatment options, side effects, and disease management? One effective approach could be developing virtual patient assistance systems that deliver real-time, curated, and trustworthy answers to patient questions. By offering this on-demand support, pharma companies can ensure that patients have access to the most relevant, reliable information tailored to their specific needs—ultimately reducing uncertainty and empowering patients throughout their journey.
How NLP can transform patient engagement in pharma
Our analysis revealed not only patient and caregiver frustrations but also deeper insights into their emotional experiences, information gaps, and unmet needs—insights that traditional engagement surveys often miss. By leveraging LLMs, we uncovered valuable data that can drive more patient-centered initiatives, whether in streamlining the patient journey, enhancing support, or informing commercial strategy.
We were honored to share these findings at Pharma SOS and look forward to continuing the conversation on how the industry can turn these insights into meaningful action. For companies seeking to enhance their impact, the path is clear: Start by understanding what patients are truly saying.
Enhance patient insights with Definitive Healthcare
Discover how Definitive Healthcare can help pharma companies gain a deeper understanding of the patient journey. With our comprehensive data and insights, you can make more informed decisions, enhance patient support, and refine your commercial strategies. Start your journey today—sign up for a free trial and experience firsthand how our platform empowers you with the insights you need to fully understand the patient journey.