A recent publication in Nature raised the question: “What prevents us from reusing medical real-world data in clinical research?” Medical real-world data (RWD) stored in clinical systems is a valuable knowledge source for medical research, but its use is still hampered by various technical and cultural obstacles. Secondary use of RWD is a great opportunity to immediately enable comprehensive and meaningful medical data science (MDS) analyses.
Reusing medical real-world data for medical data science: Can technology alone address the challenges?
The short answer is “no”! The primary objectives in facilitating, or even enabling, the use of medical RWD for research are promoting interoperability, harmonization, and data quality, and ensuring privacy. They also include optimizing the retrieval and management of patient consent and establishing guidelines for data use and access. These initiatives are designed to tackle the diverse challenges of scientifically repurposing routine clinical data, as outlined in Nature.
Uncertainties, with their various sources and consequences, pose significant challenges for the reuse of medical RWD. These uncertainties can be attributed to specific roles and addressed with appropriate measures to mitigate their impact. Implementing possible technical changes may seem easier than addressing the cultural challenges listed above.
Technical challenges of curating medical RWD sets and possible measures for improvement
When describing the challenges of balancing benefits and harms in MDS projects, the Nature publication suggested some measures that require technical solutions. One example is the implementation of data protection measures such as data access control, safe data transfer, encryption, or de-identification. However, technology brings challenges of its own, not only solutions.
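To make one of the data protection measures mentioned above more concrete, here is a minimal Python sketch of de-identification. It is an illustration under stated assumptions, not a description of any particular product: the field names (`patient_id`, `name`, `address`, `phone`) and both function names are hypothetical. Strictly speaking, the keyed hash shown here is pseudonymization. Full de-identification of clinical data usually also has to deal with dates, rare diagnoses, and free text.

```python
import hashlib
import hmac

# Hypothetical field names; real clinical schemas will differ.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}


def pseudonymize_id(patient_id: str, secret_key: bytes) -> str:
    """Replace a patient ID with a keyed hash (HMAC-SHA256).

    The same ID and key always give the same pseudonym, so records can
    still be linked, but the original ID cannot be read from the output.
    """
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and pseudonymize the patient ID."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["patient_id"] = pseudonymize_id(str(record["patient_id"]), secret_key)
    return cleaned
```

The secret key would have to be stored separately from the research data set. Anyone holding both could re-link pseudonyms to patients.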
The lack of structured data is one of the main technical challenges in curating medical RWD sets. Automated data structuring is a practical measure to improve this situation.
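As a toy illustration of what automated data structuring means, the following Python sketch pulls two fields out of a free-text note with regular expressions. This is deliberately simplistic and is an assumption-laden example, not the ML/NLP approach described later in this article: the patterns, the function name `structure_note`, and the output keys are all hypothetical, and real clinical text needs far more robust processing.

```python
import re
from typing import Optional

# Illustrative patterns only; they will miss many real-world phrasings.
AGE_PATTERN = re.compile(r"(\d{1,3})[- ]year[- ]old", re.IGNORECASE)
# Matches strings shaped like ICD-10 codes, e.g. "C50.9" or "E11".
ICD10_PATTERN = re.compile(r"\b([A-Z]\d{2}(?:\.\d{1,2})?)\b")


def structure_note(note: str) -> dict:
    """Turn a free-text note into a small structured record."""
    age_match = AGE_PATTERN.search(note)
    age: Optional[int] = int(age_match.group(1)) if age_match else None
    return {"age": age, "icd10_codes": ICD10_PATTERN.findall(note)}


note = "A 64-year-old woman was discharged with diagnosis C50.9 and E11."
record = structure_note(note)
```

For the sample note, `record` holds the age as an integer and the two code-shaped strings as a list, which is the kind of structured output that can then be searched and aggregated.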
‘DataStruct’ and more…
Unstructured (free text) data is immensely valuable to healthcare. “Working with thousands of hospital discharge summaries and clinical notes is like talking with and learning from thousands of unknown healthcare professionals,” said Desislava Mihaylova, CEO and Founder of Sqilline. “And the beauty is that it’s all done by the machine.” At Sqilline, we enable researchers and clinicians to learn from the experiences and treatment outcomes of every single patient. Automated AI-enabled data structuring is a key part of our data analytics ‘Danny Platform’.
‘Danny Platform’ serves as our analytics engine, aggregating and harmonizing various heterogeneous data sources (EHRs, lab tests, registries, etc.) into digitized and structured medical records to support informed clinical decisions and research. The structured medical data can be used directly by the physicians in charge or accessed in an anonymized and aggregated manner by external stakeholders beyond a specific medical institution. It provides comprehensive searches, in-depth analyses, predictions, treatment solutions, and decision support to physicians, researchers, and payers.
The unified structures created by the ML/NLP algorithms make it possible to build cohorts of patients who meet specific inclusion and exclusion criteria defined by the customer. These cohorts form the foundation of our three key solutions: (1) Danny DataStruct, (2) Danny Decision Support, and (3) Danny e-Clinical Research.
With ‘Danny Platform’ and its integrated ML/NLP algorithms, a comprehensive ‘data lake’ of health information is established. This repository facilitates pioneering research in medicine, uncovering real-world data and previously unnoticed connections between patient symptoms, diagnoses, genetic profiles, and treatments.
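The idea of selecting a cohort by inclusion and exclusion criteria can be sketched generically in Python. This is a minimal illustration, not the platform's implementation: the `Patient` fields, the `build_cohort` function, and the sample criteria are all hypothetical. A patient enters the cohort only if every inclusion criterion holds and no exclusion criterion does.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set


@dataclass
class Patient:
    # Hypothetical minimal structured record.
    patient_id: str
    age: int
    diagnoses: Set[str] = field(default_factory=set)


Criterion = Callable[[Patient], bool]


def build_cohort(
    patients: Iterable[Patient],
    inclusion: List[Criterion],
    exclusion: List[Criterion],
) -> List[Patient]:
    """Keep patients meeting all inclusion and no exclusion criteria."""
    return [
        p
        for p in patients
        if all(c(p) for c in inclusion) and not any(c(p) for c in exclusion)
    ]


patients = [
    Patient("p1", 64, {"C50.9"}),
    Patient("p2", 17, {"C50.9"}),
    Patient("p3", 58, {"C50.9", "E11"}),
]
cohort = build_cohort(
    patients,
    inclusion=[lambda p: "C50.9" in p.diagnoses, lambda p: p.age >= 18],
    exclusion=[lambda p: "E11" in p.diagnoses],
)
```

Expressing criteria as small predicate functions keeps them independent of each other, so a customer-defined rule set can be assembled without changing the selection logic.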