SDoH NLP and Tenasol

Social determinants of health NLP (SDoH NLP) is the use of natural language processing to detect factors in patients' lives that are peripheral to their medical status but have a large impact on their wellbeing.


Health outcomes are better in some places than others. We must learn why and what we can do. Created by Tenasol with public data.

A few points about SDoH & SDoH NLP:

  • SDoH is a broad concept. It covers social, biological, behavioral, and physical-environment factors that surround a patient's medical care. Examples include food insecurity, income level, veteran status, and language barriers.

  • SDoH NLP is not easy. Information about a patient’s day-to-day life is surprisingly uncommon in medical records, and it is usually placed in unstructured sections, which are difficult to parse. Quantifying and categorizing these concepts is also difficult because SDoH is so broad. Tenasol uses a blend of SDoH NLP, structured data extraction, and public data source connections to meet our clients' needs. Contact us for more information.

  • SDoH seriously matters. It is well established in population health research that people of differing race, ethnicity, or income have differing experiences with the US healthcare system. The federal government knows this and actively works to close these gaps.

  • SDoH data can come from anywhere. It does not have to come from a medical record; it can also come from public sources such as NOAA and census data. See our healthcare data sources blog for more details.

Why SDoH NLP Matters

Some ICD-10-CM codes (specifically the Z codes in categories Z55 through Z65), along with certain LOINC, SNOMED, and HCPCS codes, describe social factors, but they are rarely documented because they are not considered billable. Stated another way, practitioners have little incentive to record these SDoH codes: doing so does not directly improve the patient's treatment, and it does not increase the practitioner's payment.
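When Z codes are present in structured data, picking them out is straightforward. The sketch below assumes that the SDoH-related codes of interest are the ICD-10-CM categories Z55 through Z65 and that codes arrive as plain strings; the function names are illustrative and are not part of any Tenasol API.

```python
import re

# ICD-10-CM categories Z55-Z65 describe socioeconomic and psychosocial
# circumstances. Treating exactly this range as "SDoH" is an assumption.
SDOH_Z_CATEGORIES = range(55, 66)

# "Z" + two-digit category + optional subcategory, with or without the dot.
Z_CODE = re.compile(r"Z(\d{2})(\.?[0-9A-Z]{1,4})?")


def is_sdoh_z_code(code: str) -> bool:
    """Return True if the code falls in ICD-10-CM categories Z55-Z65."""
    match = Z_CODE.fullmatch(code.strip().upper())
    return bool(match) and int(match.group(1)) in SDOH_Z_CATEGORIES


def sdoh_codes(codes):
    """Keep only the SDoH Z codes from a list of diagnosis codes."""
    return [code for code in codes if is_sdoh_z_code(code)]


if __name__ == "__main__":
    print(sdoh_codes(["E11.9", "Z59.41", "z60.2", "Z79.4", "Z55"]))
```

In practice this filter returns very little, which is the point of the paragraph above: the codes exist, but they are seldom recorded.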

Instead, the data ends up in unstructured text summaries written by the practitioner or in forms, both of which must either be manually reviewed for SDoH information or parsed by SDoH NLP.
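To make the parsing problem concrete, here is a deliberately simple rule-based sketch that scans a free-text note for a few SDoH phrases. The categories and phrase lists are hypothetical and far from complete, and this is not a description of Tenasol's pipeline, which uses machine learning. It also ignores negation ("denies homelessness"), which any production system would need to handle.

```python
import re

# Hypothetical phrase lists for illustration only; a real lexicon is
# much larger and is usually combined with a trained model.
SDOH_PATTERNS = {
    "food_insecurity": [r"food insecur\w*", r"skips? meals?", r"food bank"],
    "housing_instability": [r"homeless\w*", r"evict\w*"],
    "language_barrier": [r"interpreter", r"limited english"],
    "veteran_status": [r"\bveteran\b"],
}


def find_sdoh_mentions(note: str):
    """Return (category, matched text) pairs found in a free-text note."""
    mentions = []
    for category, patterns in SDOH_PATTERNS.items():
        for pattern in patterns:
            for match in re.finditer(pattern, note, flags=re.IGNORECASE):
                mentions.append((category, match.group(0)))
    return mentions


if __name__ == "__main__":
    note = "Patient reports she skips meals to pay rent and needs an interpreter."
    print(find_sdoh_mentions(note))
```

Keyword rules like these are easy to audit but miss paraphrases and context, which is one reason the breadth of SDoH makes the task hard.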

PHQ-9 forms screen for depression, but they are rarely represented in structured digital content. They are usually found in unstructured content, where SDoH NLP can detect them.
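As a small illustration, the sketch below pulls a PHQ-9 total out of a line of free text and maps it to the standard severity bands for the 0 to 27 score. The regular expression is an assumption about how a score might be written in a note; scanned forms, per-item answers, and nearby dates or other numbers would need more careful handling than this.

```python
import re

# Standard PHQ-9 severity bands: (upper bound of total score, label).
PHQ9_BANDS = [
    (4, "minimal"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (27, "severe"),
]

# Assumed phrasing: "PHQ-9", up to 30 non-digit characters, then the total.
PHQ9_SCORE = re.compile(r"PHQ[-\s]?9\D{0,30}?(\d{1,2})\b", re.IGNORECASE)


def phq9_from_text(text: str):
    """Return (score, severity) for the first PHQ-9 total found, else None."""
    match = PHQ9_SCORE.search(text)
    if match is None:
        return None
    score = int(match.group(1))
    for upper_bound, label in PHQ9_BANDS:
        if score <= upper_bound:
            return score, label
    return None  # totals above 27 are not valid PHQ-9 scores


if __name__ == "__main__":
    print(phq9_from_text("PHQ-9 total score: 14, discussed therapy options."))
```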

Entities that benefit from SDoH NLP include:

  • Research Institutions: Academia is constantly researching what affects whom and by how much. Data gleaned from bulk medical records using SDoH NLP directly supports that work.

  • Government: Government bodies want benefits to reach everyone equitably, regardless of health status. When a sub-population is not receiving them, the federal government creates programs and policy to incentivize change.

  • Health Plans: As we discussed in our HEDIS blog, health plans seek a high public star rating because it impacts their bottom line. Plans that take care of more disadvantaged patients are rewarded for it. Currently, through the social risk factors (SRF) identified in the Health Equity Index (HEI), the federal government incentivizes plans to take on members who are low-income, dual eligible (qualifying for both Medicare and Medicaid), and/or disabled, as this also affects their star rating.

When paired with hard medical data such as diagnoses, SDoH NLP data becomes even more powerful.

SDoH NLP and HL7 Gravity


HL7, the organization responsible for FHIR, CDA, and ADT messaging, also has a strong interest in making sure that SDoH data is detectable, transmittable, quantifiable, and categorizable. The Gravity Project, an HL7 FHIR Accelerator, works to better document where this data exists and how to better structure it.

This supports SDoH NLP systems by clarifying what to look for in unstructured data and by allowing them to report what they find in more standardized and structured ways.

Here is a link to the Gravity Project confluence.

SDoH NLP and Tenasol


Tenasol's data extraction pipeline extracts SDoH in a quantifiable way from both unstructured and structured data. Our SDoH NLP system pulls data from:

  • Patient demographics

  • LOINC, ICD-10-CM, HCPCS, and SNOMED terms

  • Indirect inferences from medical records using machine learning and AI

  • Unstructured client-specified SDoH NLP targets, identified using machine learning

  • Structured or unstructured patient-completed SDoH evaluation forms, processed using machine learning

  • External sources linked to patient data including:

    • Air quality

    • Weather conditions

    • Census data (including transport and financial status, among others, for the local area)

  • …and others
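Linking external sources to patient data usually comes down to a join on a shared geographic key. The sketch below assumes the key is a ZIP code and that area-level values have already been loaded into a lookup table. The field names are hypothetical, and the numbers are placeholders for illustration, not real statistics.

```python
from dataclasses import asdict, dataclass


@dataclass
class AreaContext:
    """Area-level context for one geographic unit (hypothetical fields)."""
    median_household_income: int      # e.g. from census data
    pct_households_no_vehicle: float  # e.g. from census transport tables
    mean_air_quality_index: float     # e.g. from an air quality feed


# Placeholder values for illustration only; "00001" is a made-up ZIP code.
AREA_CONTEXT_BY_ZIP = {
    "00001": AreaContext(50000, 10.0, 40.0),
}


def enrich(patient: dict, area_lookup: dict) -> dict:
    """Return a copy of the patient record with area-level context attached."""
    context = area_lookup.get(patient.get("zip"))
    enriched = dict(patient)
    enriched["area_context"] = asdict(context) if context else None
    return enriched


if __name__ == "__main__":
    print(enrich({"patient_id": "p1", "zip": "00001"}, AREA_CONTEXT_BY_ZIP))
```

Area-level values describe a neighborhood rather than an individual, so they are best read as context alongside what the record itself says about the patient.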

Once Tenasol extracts this data, we store it in an internal proprietary format and can transform it to CSV, XLSX, a FHIR representation via API, or, in some cases, custom formats depending on client preferences. Unnecessary data elements are removed beforehand.
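As a rough illustration of the CSV path, the sketch below writes a list of extracted records to CSV while dropping any fields the client did not ask for. The record layout and field names are hypothetical; Tenasol's internal format is proprietary and is not shown here.

```python
import csv
import io


def to_csv(records, keep_fields):
    """Serialize records to CSV text, keeping only the requested fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=keep_fields,
        extrasaction="ignore",  # silently drop fields not in keep_fields
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical extracted records, including an internal-only field.
    records = [
        {"patient_id": "p1", "category": "food_insecurity", "internal_note": "x"},
        {"patient_id": "p2", "category": "language_barrier", "internal_note": "y"},
    ]
    print(to_csv(records, ["patient_id", "category"]))
```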

Tenasol SDoH NLP use cases


Tenasol has demonstrated the use of data derived from SDoH NLP for:

  • Generalized SDoH extraction for clients interested in using that data for their own research purposes.

  • Support of star-rating analysis for health plans seeking to better understand their populations.

  • Condition correlations for research analysis.

  • Sensitivity analysis for research associated with investment decisions at the local level in areas such as education and transportation.

  • Visualization for site placement decision support or general informatics, as well as combining with geospatial AI for more detail.

Reach out for more information.

Conclusion

Social determinants of health NLP transforms unstructured and structured data into actionable insights, enabling healthcare stakeholders to address disparities in patient care. At Tenasol, we specialize in extracting and codifying SDoH data through advanced machine learning techniques, leveraging sources such as patient demographics, medical coding standards like LOINC and ICD-10-CM, and public datasets like census data, air quality, and weather conditions. By incorporating both client-specified SDoH targets and patient-completed evaluation forms, we bridge critical gaps in healthcare data analysis.

Our commitment to SDoH NLP aligns with initiatives like the HL7 Gravity Project, ensuring standardized, quantifiable insights that drive meaningful change. Whether aiding research institutions, government programs, or health plans, Tenasol provides tools and visualizations that empower data-driven decision-making. Together, we can better understand the factors that shape health outcomes and work to build a more equitable healthcare system.

Contact us to learn more!
