Investigate the use of natural language processing (NLP) techniques to extract relevant information from clinical notes and identify diseases

Authors

  • Saif Mohammed Khan, Department of Engineering Management, Christian Brothers University, Tennessee, USA. Email: fsaifmoh@cbu.edu Author
  • Bin Ibrahim Ismail, Department of Engineering Management, Christian Brothers University, Tennessee, USA. Email: fismailb@cbu.edu Author
  • Samad Abdul, Department of Engineering Management, Christian Brothers University, Tennessee, USA. Email: sabdul@cbu.edu Author
  • Shaikh Abdul Sattar, Department of Engineering Management, Christian Brothers University, Tennessee, USA. Email: fshaikha@cbu.edu Author

Keywords:

Natural Language Processing (NLP), Clinical Notes, Disease Identification, Named Entity Recognition (NER), Term Frequency-Inverse Document Frequency (TF-IDF)

Abstract

The growing volume of unstructured clinical data is a major hurdle for health systems that want to use this information to improve patient care. Natural Language Processing (NLP) has considerable potential to turn this unstructured data into useful knowledge. In this work, we investigate the application of NLP to disease identification by extracting pertinent content from clinical notes. Specifically, we aim to improve disease identification from clinical text by applying NLP methods such as Term Frequency-Inverse Document Frequency (TF-IDF), named entity recognition (NER), and deep learning models. Several NLP algorithms were evaluated on a large dataset of clinical notes to test how effectively they recognize disease-related information. Our results indicate that NLP can improve the detection of diseases from clinical notes and may therefore support more timely or improved diagnosis and treatment planning. This proof-of-concept study suggests significant potential for applying NLP to pre-processing and to the integration of unstructured data into structured form in clinical analysis, while also underlining the need for further research on optimizing NLP algorithms for practical medical use.
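
To make the TF-IDF step named in the abstract concrete, the following is a minimal sketch of how term weights can be computed over a few notes. It is not the authors' pipeline: the three notes are invented for illustration, the function name `tf_idf` is hypothetical, and the unsmoothed weighting tf × log(N / df) is only one common variant (libraries often apply smoothing or normalization on top of it).

```python
import math
from collections import Counter

# Hypothetical toy notes, invented for illustration; not from the study's dataset.
notes = [
    "patient reports chest pain and shortness of breath",
    "patient has type 2 diabetes and reports fatigue",
    "patient reports fever and cough consistent with pneumonia",
]

def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of notes that contain each term at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            # Relative term frequency times inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

weights = tf_idf(notes)
for note, w in zip(notes, weights):
    # Keep only terms with a positive weight, i.e. terms not shared by every note.
    distinctive = sorted(term for term, value in w.items() if value > 0)
    print(note, "->", distinctive)
```

Under this weighting, words that occur in every note (here "patient", "reports", and "and") receive a weight of zero, while terms confined to a single note, such as "diabetes" or "pneumonia", receive positive weights. That is the property that makes TF-IDF a plausible first filter for disease-related vocabulary before NER or a deep learning model is applied.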

Published

2024-08-15