Danny Platform on ASCO JCO Clinical Cancer Informatics

In this era of value-based care, digitizing the unstructured text in multilingual electronic health records (EHRs) and extracting accurate data from it is vital, and it remains a real challenge for all stakeholders working to advance clinical research and improve patient care.

Sqilline developed an algorithm that was acknowledged and published in the ASCO Journal of Clinical Oncology Clinical Cancer Informatics on Oct 02, 2019.

The manuscript “Clinical data extraction and normalization of Cyrillic electronic health records (EHR) via deep-learning natural language processing (NLP)” was written by Sqilline’s R&D and Chief Data Scientist, Dr. Boyang Zhao, Ph.D.

The study focused specifically on Bulgarian EHRs, whose native language uses a non-Latin alphabet. To extract the biomarker status of patients with breast cancer automatically and correctly, Sqilline developed an algorithm based on deep-learning natural language processing. The ASCO publication provides examples for three breast cancer biomarkers: estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2).

The challenges of extraction included: different terms in the two languages; misspellings; multiple variants of the same word; varying positions of the value relative to the target parameter; variable lengths of the parameter; human errors in labeling; and substantially imbalanced data sets for certain parameter values.
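To make these challenges concrete, here is a minimal rule-based baseline in Python. It is not Sqilline's method (the study used deep learning); it is only a sketch of how mixed-language terms, misspellings, and varying value positions might be handled naively. The lexicon entries, the function names `normalize_status` and `extract_status`, and the window size are all hypothetical choices made for illustration.

```python
import difflib
import re

# Hypothetical mini-lexicon mapping English and Bulgarian variants
# to a normalized status. A real lexicon would be far larger.
LEXICON = {
    "positive": "positive", "pos": "positive",
    "положителен": "positive", "позитивен": "positive",
    "negative": "negative", "neg": "negative",
    "отрицателен": "negative", "негативен": "negative",
}


def normalize_status(token):
    """Map one token to 'positive'/'negative', tolerating small misspellings."""
    token = token.lower()
    if token in LEXICON:
        return LEXICON[token]
    # Fuzzy match against known variants to absorb typos (cutoff is an assumption).
    close = difflib.get_close_matches(token, list(LEXICON), n=1, cutoff=0.8)
    return LEXICON[close[0]] if close else None


def extract_status(text, marker, window=3):
    """Look for a status value within `window` tokens after, then before, the marker."""
    tokens = re.findall(r"\w+", text.lower())  # \w also matches Cyrillic letters
    for i, tok in enumerate(tokens):
        if tok != marker.lower():
            continue
        after = range(i + 1, min(i + 1 + window, len(tokens)))
        before = reversed(range(max(0, i - window), i))
        for j in list(after) + list(before):
            status = normalize_status(tokens[j])
            if status is not None:
                return status
    return None


# Value after the marker, in Bulgarian.
print(extract_status("ER: положителен", "ER"))
# Value before the marker, in English.
print(extract_status("positive for ER", "ER"))
# Misspelled value.
print(extract_status("HER2 negativ", "HER2"))
```

A baseline like this breaks down quickly as variants, typos, and sentence structures multiply, which is the kind of difficulty that motivates a learned model.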

By combining several techniques (dual word embeddings that encode syntactic and polarity information in two languages, embedding space alignment, and a deep neural network architecture), Danny Platform can correctly extract and normalize the biomarker statuses of patients with breast cancer. The study demonstrates that deep-learning NLP models based on convolutional neural networks (CNNs) or recurrent neural networks (RNNs) substantially outperform classical machine learning algorithms. The joint architecture delivered much higher-performance retrieval of biomarker status from unstructured medical data despite the challenges above.
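The idea of embedding space alignment can be sketched with orthogonal Procrustes, one common way to map one language's word vectors into another's space. The article does not say which alignment method the study used, so this choice is an assumption, as are the toy data, the dimensions, and the function name `procrustes_align`. The example needs NumPy.

```python
import numpy as np


def procrustes_align(src, tgt):
    """Return the orthogonal matrix W that minimizes ||src @ W - tgt||_F.

    Rows of `src` and `tgt` are embeddings of paired anchor words,
    e.g. a Bulgarian term and its English translation.
    """
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt


rng = np.random.default_rng(0)

# Toy "English" embeddings for 50 anchor words, dimension 8 (synthetic data).
en = rng.normal(size=(50, 8))

# Simulate a "Bulgarian" space as the same vectors under an unknown rotation.
rotation, _ = np.linalg.qr(rng.normal(size=(8, 8)))
bg = en @ rotation

# Learn the map from the Bulgarian space back into the English space.
w = procrustes_align(bg, en)
aligned = bg @ w

# For this synthetic case the alignment should be exact up to rounding error.
max_error = float(np.abs(aligned - en).max())
print(max_error)
```

Once both vocabularies share one space, a single downstream classifier can in principle consume tokens from either language. Real embeddings are only approximately related by a rotation, so the residual error would not be near zero as it is here.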

Sqilline aims to develop novel approaches in natural language processing (NLP) for the extraction and normalization of unstructured data from electronic health records (EHRs) containing both English and Cyrillic text and to apply them to other mixed-language medical text data (e.g., EHRs in Russian, Ukrainian, or Serbian).

The adoption of artificial intelligence (AI) in healthcare faces a major barrier: the tremendous volume of unstructured text in EHRs. This is not the only hurdle; other tasks, such as capturing drug treatment durations and therapy effectiveness, remain. Still, the Sqilline approach is significant and well deserves its recognition by ASCO.

By achieving high accuracy in medical data extraction and normalization, the technology in Danny Platform could be vital for properly measuring treatment outcomes in value-based care.

Link to ASCO publication: 

https://ascopubs.org/doi/pdf/10.1200/CCI.19.00057#.XZXRQD13b2c.email
