School

Nairobi Institute of Technology - Westlands

Course

SCH 306

Subject

Health Science

Date

Nov 24, 2024

Type

docx

Pages

19

Uploaded by UltraLorisPerson1117

Text and Audio Classification Enabled Diagnosis for Treatment Applications by Natural Language Processing (NLP) and Deep Learning (DL)

Dissertation Proposal Submitted to National University School of Technology and Engineering in Partial Fulfillment of the Requirements for the Degree of DOCTOR OF DATA SCIENCE by

Thank you for sharing your first draft of Chapter 1 with the Doctoral Record High Committee. You have many great ideas, and you have done quite a bit of detailed research here. I am the Subject Matter Expert (SME), and I have attached my recommendations and comments on your paper below. Please read through them.

Your paper has sections that refer to a survey review of the timeline of adaptation of ML processes. However, by reading some other sections of your RQs and your final summary, I am convinced that you want to complete constructive research on secondary text and audio data and compare the models' diagnostic effectiveness. You need to clean up the paper so it has only one focus and direction, and then you need to research what methodologies you plan to use.
Table of Contents

Chapter 1: Introduction
  Natural Language Processing (NLP) and Text for Healthcare
  Statement of the Problem
  Purpose of the Study
  Introduction to the Theoretical Framework
  Introduction to Research Methodology and Design
  Research Questions
    RQ1
    RQ2
  Hypotheses
    H1₀
    H1ₐ
    H2₀
    H2ₐ
  Significance of the Study
  Definition of Key Terms
  Summary
References
Chapter 1: Introduction

Technological improvements are bringing about a transformational age in the healthcare sector, and Natural Language Processing (NLP) and Deep Learning (DL) are at the forefront of these developments. NLP has the potential to change how doctors diagnose ailments, provide treatments, and communicate with patients. Healthcare documentation has also improved with the adoption of the Electronic Healthcare Record and digital imagery solutions, which provide the data necessary to improve medical outcomes and streamline workflows.

This research study will focus on healthcare NLP and DL. Medical text and audio classification may improve medical treatment and diagnosis applications, reducing morbidity and mortality. Kobritz et al. (2023) suggest that "Machine-learning algorithms show promise in improving identification of complications." This research analyzes the application of Machine Learning (ML) to healthcare text and audio data to classify disease.

This chapter provides an overview of the clinical application of NLP and audio classification in treatment and diagnosis to explain its importance in the medical sector. The problem statement addresses the gap in previous research in this field and helps improve effectiveness in meeting the requirements of treatment and diagnosis applications of big data in clinical diagnosis. This chapter provides the background, problem, purpose, variables, population, sample, and conceptual framework for this research. It also develops the research hypotheses, questions, and significance of the research topic.

The healthcare sector increasingly depends on cutting-edge technology to improve patient care, reorganize clinical processes, and boost diagnostic precision. NLP provides essential tools for this setting (Johri et al., 2021). NLP is a set of methods for processing unstructured text.
Studies indicate that implementing NLP in
healthcare may improve medical diagnosis (Healthcare, 2020) because health systems are gradually becoming better able to interpret, analyze, and search large quantities of patient information. Smeaton (1999) found that "the impact of NLP on information retrieval tasks has largely been one of promise rather than substance" (p. 99). NLP methods can enhance decision-making in the medical sector, for example by taking clinical notes datasets with text transcriptions (.csv text data) and labeling them according to ailment category.

NLP techniques are only one ML category that might be used for reducing morbidity and mortality in healthcare. Deep learning techniques might be applied to clinical audio to extract diagnoses. Fagherazzi et al. (2021) note that, to keep control over the recorded vocal task while allowing patients to choose their own words and preserve their naturalness, semi-spontaneous voice tasks are designed in which the patient is instructed to talk about a particular topic (e.g., a picture description or story narration task).

To fully grasp the importance and immediacy of this research, it is crucial to situate it within the broader context of the field. Here, this involves placing the research within the intersection of healthcare, technology, and data analytics. The interface of these domains is where the proposed study, centered around text and audio classification for diagnosis, gains its importance and current relevance. For example, applying NLP and DL to medical diagnosis has significant practical implications. The capacity to accelerate and improve diagnostic accuracy in the healthcare domain substantially impacts patient care and outcomes. The applied significance of the research topic is therefore also relevant.
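To make the idea of labeling clinical text transcriptions by ailment category more concrete, the following is a minimal sketch only, not the study's model. It assumes a simple multinomial Naive Bayes classifier as a stand-in for the NLP and DL models the study proposes. The class name `NaiveBayesTextClassifier`, the toy transcriptions, and the labels are all hypothetical and are not drawn from the study's dataset.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a transcription and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    Hypothetical stand-in for the study's NLP/DL classifiers.
    """

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / self.total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy transcriptions and ailment labels (illustration only).
train_texts = [
    "I have a sharp pain in my knee when I walk",
    "my knee hurts after running",
    "I keep coughing and my throat is sore",
    "a dry cough keeps me awake at night",
]
train_labels = ["joint pain", "joint pain", "cough", "cough"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
prediction = model.predict("my knee is in pain")
print(prediction)
```

A word-count model like this ignores word order and context, which is one reason a study of this kind might prefer deep learning models for the actual experiments.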
This study summarizes previous research on utilizing NLP and DL in the healthcare industry and how they have influenced diagnostic procedures to date, and then leverages these tools on an existing dataset to generate classification models. This acquaints the audience with the
current state of the art and emphasizes the basis of the proposed research. The progress and gaps identified in previous studies highlight the importance of addressing the research problem. The text begins by providing an overview of the research domain and its difficulties. As the introduction unfolds, it gradually focuses on particular research areas and areas lacking knowledge. A clear overview of these challenges and obstacles leads into the problem statement.

Statement of the Problem

The problem to be addressed in this study is the difficulty experienced in medical diagnosis and treatment due to the absence of trustworthy tools for analyzing textual and auditory data in healthcare settings (Stark et al., 2018; O'Cathain et al., 2020). In modern healthcare, where data is abundant, not using it causes delays, errors, and missed opportunities for early action. Current diagnosis and treatment methods rely on human interpretation, making the management of medical records and audio recordings complex and unpredictable. Healthcare personnel, patients, organizations, and society are affected by this issue.

The difficulties healthcare providers experience in processing massive amounts of textual and auditory data are evidence of provider exhaustion and a drop in care quality (Stark et al., 2018). Consequently, patients face challenges obtaining prompt and accurate diagnoses, resulting in subpar treatment outcomes. The rising costs and potential legal dangers healthcare organizations face are significant factors in escalating healthcare prices and deteriorating social well-being.

Inefficient medical data analysis causes patient suffering, treatment delays, higher healthcare expenses, and legal issues. The total stakeholder influence and the complexity of the variables preventing data use are unclear and must be clarified.
Neglecting the issue risks patient suffering, higher costs, and missed
early intervention. The study will highlight the importance of answering these questions to enhance healthcare by decreasing unintended consequences and improving medical data analysis. NLP and DL approaches in medical diagnosis and therapy will be examined using qualitative research methods and a case study design (Ritter, 2021; Stark et al., 2018; O'Cathain et al., 2020). This research develops NLP and DL models to reduce disease diagnostic and treatment inefficiencies and improve healthcare for all stakeholders.

Purpose of the Study

The purpose of this quantitative correlational study is to build symptom classifier models that support patient diagnosis and treatment using NLP and DL techniques based on text and audio data. This project intends to address the inefficiencies indicated in the problem statement by employing advanced technologies to streamline healthcare diagnostic and treatment processes. The problem outlined, namely the underuse of textual and audio data in healthcare that leads to delays in diagnosis and incorrect treatment (Ritter, 2021), is addressed directly in this study by developing models that leverage both text and audio to classify patient symptoms.

The study will progress in stages, beginning with data curation and ending with usable NLP and DL models for classifying patient symptoms. A multi-input DL architecture will classify audio and text elements and return the likely nature of the patient's symptoms. The dependent variable is the nominal list of possible patient symptoms, while the independent variables are the audio and text inputs. The possible effects on healthcare providers, patients, and institutions will also be investigated. Patients in healthcare settings and the medical staff who care for them are the primary audiences of the proposed study. The study will leverage open-source data, including
5,385 training observations (audio and text) with labels, 381 validation observations, and 385 test observations. The study will be performed at the researcher's home using publicly available, anonymous data.

Introduction to the Theoretical Framework

This study's theoretical frameworks are the Field Theory of Health Services and the Theory of Diffusion of Innovations, two interrelated but distinct bodies of knowledge. Everett Rogers' Theory of Diffusion of Innovations (Curtis, 2020) provides a foundational theoretical framework for understanding how new ideas and technologies move throughout a society. Because the study focuses on how advancements in natural language processing (NLP) and deep learning (DL) are spreading throughout the healthcare industry, this theory is of particular relevance. Rogers' approach makes it possible to understand how healthcare practitioners and organizations might accept and implement NLP and DL advancements for medical diagnosis and treatment. The fundamental ideas of this theory, such as the innovation-decision process, adopter categories, and communication channels, help explain the complex dynamics surrounding the integration of these technologies into healthcare practices.

For a more comprehensive view of the healthcare system and its intricacies, the study draws on the insights into healthcare policy dynamics and comparative analysis provided by Howard Leichter's Field Theory of Health Services (Kozłowska & Sikorski, 2021). The relationships between healthcare practitioners, organizations, policies, and patients are central to this paradigm. By adopting Leichter's approach, the study acquires a deeper appreciation for the multifaceted environment in which NLP and DL developments are deployed in the healthcare system.
Decisions about the study's research are based on integrating these two theoretical models. The problem statement is shaped by considering the sector's complexity and the demand for novel approaches. The statement of purpose is congruent with these frameworks since it emphasizes the potential spread of NLP and DL in the healthcare industry, propelled by the novelty of these technologies themselves (Bianchini et al., 2020). These unified models also guide the research questions concerned with the spread of NLP and DL innovations inside the complex healthcare system. Combining these theoretical frameworks paints a fuller picture of the diffusion of NLP and DL advances in healthcare and helps ground research decisions in theory. The integrated approach adds a deeper understanding of the advantages and downsides of implementing these technologies in healthcare practices.

Introduction to Research Methodology and Design

The nature of this constructive research is to study the medical industry segment. It utilizes interviews, with a sampling method, for data collection. The study aims to scrutinize complex matters associated with applying DL and NLP technologies in the health sector. Moreover, the applied approach is grounded in the quantitative paradigm and coheres well with most fundamental research studies developed for ML and DL. This study has been informed by the works of Creswell and Stake about the ML and DL approaches (Verleye, 2019). Consequently, their perspectives justify the choice of the quantitative research strategy adopted herein. Although this section does not go into great detail, their fundamental work suggests that a quantitative analysis of the challenge of integrating NLP and DL for healthcare purposes is well justified and valid.
A quantitative method has been chosen based on the above-mentioned publications that support this choice, aligned with the goals of the study as well as the specified problem and purpose of the interviews with patients. This choice of methodology will help examine the complex interaction between NLP and DL in healthcare settings, with specific attention to the challenges and opportunities involved at the individual level. A case study design with a quantitative methodology is most suitable for this investigation and provides a methodologically sound foundation for detailed investigation into the effectiveness of NLP and DL approaches within the healthcare environments that fall within the scope of this research. For these reasons, it is essential to adhere deliberately to these approaches in order to answer the fundamental questions of this dissertation proposal and to provide a robust background for deeper analysis. The study's approach aims to support care without direct human involvement, lowering healthcare costs, reducing morbidity and mortality, and enhancing care quality.

Research Questions

Research Questions (RQs) are the guiding queries that frame the inquiry, and they should be crafted to correspond precisely with the problem statement and objective of the study.

RQ1
How effective is NLP in classifying patient symptoms from text data?

RQ2
How effective is NLP in classifying patient symptoms from audio data?

Hypotheses

Based on the research objectives, the hypotheses for this research are as follows:
H1₀
Text analysis of patient symptoms results in precision and recall insufficient for provider decision support.

H1ₐ
Text analysis of patient symptoms results in precision and recall sufficient for provider decision support.

H2₀
Audio analysis of patient symptoms results in precision and recall insufficient for provider decision support.

H2ₐ
Audio analysis of patient symptoms results in precision and recall sufficient for provider decision support.

Significance of the Study

The study will demonstrate how NLP and DL might provide decision support to providers. In the context of the problem statement, unstructured medical data increases the complexity of physicians' work in properly documenting clinical notes (Deshmukh, 2023). As a result, there is a gap in ethical perspectives in meeting patients' expectations of accurate clinical decisions. This research aims to demonstrate NLP/DL's promise for patient diagnosis. It has far-reaching implications, both in healthcare and in the larger areas of NLP and DL, for which it provides a foundation.

This study holds value for both applied and academic fields. First and foremost, this research addresses a significant issue in healthcare by utilizing the power of NLP and DL to advance symptom classification and, eventually,
medical diagnosis and treatment. Patient care might be significantly enhanced if the findings of this study were implemented. Diagnostic errors would be reduced, and treatment decisions would be made more quickly (Wu et al., 2020).

In addition, the results of this research add to the growing body of work exploring the potential of NLP and DL in the medical field. This research contributes to the academic discussion on incorporating innovation in healthcare since it follows the theoretical frameworks of Rogers' Theory of Diffusion of Innovations and the Field Theory of Health Services (Fahy et al., 2020). The study's findings will provide insights into the theoretical underpinnings of the dynamics of innovation dissemination in a complex healthcare environment. In conclusion, this research is noteworthy because of its possible beneficial effects on healthcare, increased knowledge of the practical applications of NLP and DL, and development of theoretical frameworks.

Definition of Key Terms

1. Natural Language Processing (NLP)
Natural Language Processing is the study of how computers and humans communicate, and it falls under the umbrella of AI (Goyal, 2023). It involves the creation of algorithms and models that enable computers to understand, interpret, and generate human language. In the context of this research, NLP means using these methods to analyze and act upon textual medical records for the objectives of diagnosis and therapy.

2. Deep Learning (DL)
Deep Learning is a subfield of machine learning and AI that uses artificial neural networks to model and solve complex problems (Castiglioni et al., 2021). DL approaches are characterized by numerous layers of artificial neurons, enabling them to learn patterns and
features from data automatically. This research aims to improve the accuracy of medical diagnosis and treatment by using deep neural networks to interpret and process audio data.

Summary

This study addresses the difficulty experienced in medical diagnosis and treatment due to the absence of trustworthy tools. The purpose of this quantitative correlational study is to build symptom classifier models that support patient diagnosis and treatment using NLP and DL techniques based on both text and audio data, leveraging these techniques to summarize patient symptoms and support provider decision making. The significance of this study is that it advances the diffusion of ML methods that support decision-making around the caregiving process. Investigating the ability of NLP/ML algorithms to provide precise and sensitive classification is an essential part of their diffusion into the medical sector.
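The hypotheses in this chapter judge each model by its precision and recall. As a minimal sketch of how those two measures could be computed for each symptom class, the example below compares true and predicted labels. The function names, the toy labels, and the 0.6 and 0.8 thresholds are assumptions made for illustration. The proposal does not state what level of precision and recall counts as "sufficient", so any threshold would need to be justified separately.

```python
from collections import Counter


def precision_recall_per_class(y_true, y_pred):
    """Return {label: (precision, recall)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        metrics[label] = (precision, recall)
    return metrics


def meets_threshold(metrics, threshold):
    """True if every class reaches the (hypothetical) threshold on both measures."""
    return all(p >= threshold and r >= threshold for p, r in metrics.values())


# Hypothetical labels for five test observations (illustration only).
y_true = ["cough", "cough", "joint pain", "joint pain", "cough"]
y_pred = ["cough", "joint pain", "joint pain", "joint pain", "cough"]

metrics = precision_recall_per_class(y_true, y_pred)
for label, (precision, recall) in metrics.items():
    print(f"{label}: precision={precision:.2f}, recall={recall:.2f}")
```

Reporting both measures per class matters here because a model can score well overall while still missing most cases of a rare symptom, which recall for that class would reveal.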
References

Bianchini, S., Müller, M., & Pelletier, P. (2020). Deep learning in science. arXiv preprint arXiv:2009.01575.

Bose, P., Srinivasan, S., Sleeman IV, W. C., Palta, J., Kapoor, R., & Ghosh, P. (2021). A survey on recently named entity recognition and relationship extraction techniques on clinical texts. Applied Sciences, 11(18), 8319. https://doi.org/10.3390/app11188319

Braşoveanu, A. M., & Andonie, R. (2020, September). Visualizing transformers for nlp: a brief survey. In 2020 24th International Conference Information Visualisation (IV) (pp. 270-279). IEEE. https://ieeexplore.ieee.org/abstract/document/9373074/

Castiglioni, I., Rundo, L., Codari, M., Di Leo, G., Salvatore, C., Interlenghi, M., ... & Sardanelli, F. (2021). AI applications to medical images: From machine learning to deep learning. Physica Medica, 83, 9-24.

Curtis, M. (2020). Toward understanding secondary teachers' decisions to adopt geospatial technologies: An examination of Everett Rogers' diffusion of innovation framework. Journal of Geography, 119(5), 147-158.

Deshmukh, S. S. (2023, June). Progress in Machine Learning Techniques for Stock Market Movement Forecast. In Proceedings of the International Conference on Applications of Machine Intelligence and Data Analytics (ICAMIDA 2022) (Vol. 105, p. 69). Springer Nature. https://doi.org/10.2991/978-94-6463-136-4_9

Dobbins, N. J., Mullen, T., Uzuner, Ö., & Yetisgen, M. (2022). The Leaf Clinical Trials Corpus is a new resource for query generation from clinical trial eligibility criteria. Scientific Data, 9(1), 490. https://doi.org/10.1038/s41597-022-01521-0

Fagherazzi, G., Fischer, A., Ismael, M., & Despotovic, V. (2021). Voice for Health: The Use of Vocal Biomarkers from Research to Clinical Practice. Digital Biomarkers, 5(1), 78–88. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8138221/

Fahy, N., Greenhalgh, T., & Shaw, S. (2020). PHOENIX: A new framework for applying psychological theories to the adoption of innovations by healthcare professionals.

Fairie, P., Zhang, Z., D'Souza, A. G., Walsh, T., Quan, H., & Santana, M. J. (2021). Categorizing patient concerns using natural language processing techniques. BMJ Health & Care Informatics, 28(1). https://doi.org/10.1136%2Fbmjhci-2020-100274

Goyal, A. A. (2023). The Role of Machine Learning in Natural Language Processing and Computer Vision.

Healthcare. (2020). The role of natural language processing during the COVID-19 pandemic ... https://mdpi-res.com/d_attachment/healthcare/healthcare-10-02270/article_deploy/healthcare-10-02270-v2.pdf?version=1668744020

Hisamitsu, T., Oikawa, M., & Kido, K. (2016). Care cycle optimization using digital solutions. Hitachi Review, 65(9), 399. https://www.hitachi.com/rev/archive/2016/r2016_09/pdf/r2016_09_105.pdf

Iroju, O. G., & Olaleke, J. O. (2015). A systematic review of natural language processing in healthcare. International Journal of Information Technology and Computer Science, 7(8), 44-50. https://doi.org/10.5815/ijitcs.2015.08.07

Johri, P., Khatri, S. K., Al-Taani, A. T., Sabharwal, M., Suvanov, S., & Kumar, A. (2021). Natural language processing: History, evolution, application, and future work. In Proceedings of 3rd International Conference on Computing Informatics and Networks: ICCIN 2020 (pp. 365-375). Springer Singapore. https://doi.org/10.1007/978-981-15-9712-1_31

Kang, Y., Cai, Z., Tan, C. W., Huang, Q., & Liu, H. (2020). Natural language processing (NLP) in management research: A literature review. Journal of Management Analytics, 7(2), 139-172. https://www.tandfonline.com/doi/abs/10.1080/23270012.2020.1756939

Kobritz, M., Patel, V., Rindskopf, D., Demyan, L., Jarrett, M., Coppa, G., & Antonacci, A. C. (2023). Practice-Based Learning and Improvement: Improving Morbidity and Mortality Review Using Natural Language Processing. Journal of Surgical Research, 283, 351–356. https://doi.org/10.1016/j.jss.2022.10.075

Kozłowska, U., & Sikorski, T. (2021). The Implementation of the Soviet Healthcare Model in 'People's Democracy' Countries—the Case of Post-war Poland (1944–1953). Social History of Medicine, 34(4), 1185-1211.

Kruse, C. S., Goswamy, R., Raval, Y., & Marawi, S. (2016). Challenges and opportunities of big data in health care: A systematic review. JMIR Medical Informatics, 4(4), e38. https://doi.org/10.2196/medinform.5359

Leichter, H. (1979). A Comparative Approach to Policy Analysis: Health Care Policy in Four Nations. Cambridge University Press.

Li, I., Pan, J., Goldwasser, J., Verma, N., Wong, W. P., Nuzumlalı, M. Y., Rosand, B., Li, Y., Zhang, M., Chang, D., Taylor, R. A., Krumholz, H. M., & Radev, D. (2022). Neural natural language processing for unstructured data in electronic health records: A review. Computer Science Review, 46, 100511. https://doi.org/10.1016/j.cosrev.2022.100511

Mishra, S. B., & Alok, S. (2022). Handbook of research methodology. https://www.researchgate.net/publication/319207471_HANDBOOK_OF_RESEARCH_METHODOLOGY?enrichId=rgreq-6be5390a6f24699c882b5c3de1cc9f78-XXX&enrichSource=Y292ZXJQYWdlOzMxOTIwNzQ3MTtBUzo3MTQ4NTgxNDAwMTY2NDJAMTU0NzQ0Njg3MDk1Mg%3D%3D&el=1_x_2&_esc=publicationCoverPdf
Moorhead, L. (2021, June 17). Resize multiple images to be the same size. Miro. https://community.miro.com/ask-the-community-45/resize-multiple-images-to-be-the-same-size-5101

Nawab, K., Ramsey, G., & Schreiber, R. (2020). Natural language processing to extract meaningful information from patient experience feedback. Applied Clinical Informatics, 11(02), 242-252. https://doi.org/10.1055/s-0040-1708049

O'Cathain, A., Connell, J., Long, J., & Coster, J. (2020). 'Clinically unnecessary' use of emergency and urgent care: A realist review of patients' decision making. Health Expectations, 23(1), 19-40.

Ritter, E. (2021). Your Voice Gave You Away: The Privacy Risks of Voice-Inferred Information. Duke LJ, 71, 735.

Rusk, N. (2016). Deep learning. Nature Methods, 13(1), 35–35. https://doi.org/10.1038/nmeth.3707

Shilo, S., Rossman, H., & Segal, E. (2020). Axes of a revolution: Challenges and promises of big data in healthcare. Nature Medicine, 26(1), 29-38. https://doi.org/10.1038/s41591-019-0727-5

Smeaton, A. F. (1999). Using NLP or NLP Resources for Information Retrieval Tasks. Text, Speech and Language Technology, 99–111. https://doi.org/10.1007/978-94-017-2388-6_4

Stark, Z., Lunke, S., Brett, G. R., Tan, N. B., Stapleton, R., Kumble, S., ... & Melbourne Genomics Health Alliance. (2018). Meeting the challenges of implementing rapid genomic testing in acute pediatric care. Genetics in Medicine, 20(12), 1554-1563.

Strijker, D., Bosworth, G., & Bouter, G. (2020). Research methods in rural studies: Qualitative, quantitative, and mixed methods. Journal of Rural Studies, 78, 262-270. https://doi.org/10.1016/j.jrurstud.2020.06.007
Tang, R., Chuang, Y. N., & Hu, X. (2023). The science of detecting llm-generated texts. arXiv preprint arXiv:2303.07205. https://arxiv.org/abs/2303.07205

Universal Rules for Fooling Deep Neural Networks based Text Classification. (n.d.). Ieeexplore.ieee.org. Retrieved October 24, 2023, from https://ieeexplore.ieee.org/abstract/document/8790213

Verleye, K. (2019). Designing, writing-up and reviewing case study research: an equifinality perspective. Journal of Service Management, 30(5), 549-576.

Wu, S., Roberts, K., Datta, S., Du, J., Ji, Z., Si, Y., ... & Xu, H. (2020). Deep learning in clinical natural language processing: a systematic review. Journal of the American Medical Informatics Association, 27(3), 457-470.