Overview
The purpose of this study is to highlight the usefulness of artificial intelligence and machine learning for developing computer algorithms that automatically extract and process large volumes of raw, unstructured clinical data from electronic medical files with high reliability, speed and accuracy.
Description
Despite the rapid development of medicine and computer science in recent years, medical treatment in modern clinical practice is often empirical and based on retrospective data. With the growing number of patients and their concentration in large tertiary centers, it becomes attractive to collect clinical data systematically and apply them to risk stratification models. However, as the volume of data increases, manual data collection and processing becomes a challenge, because this approach is time-consuming and costly for healthcare systems. In addition, unstructured information, such as clinical notes, is very often written as free text that is unsuitable for direct analysis. The use of artificial intelligence is promising and is expected to change medical practice in the coming years. Because the processes it offers are automated, data can be extracted quickly and reliably for further processing. The results of its use can be extended to different healthcare systems, amplifying the knowledge produced, improving diagnostic and therapeutic accuracy, and ultimately benefiting health services. Collecting vast amounts of data from different sources without compromising patients' personal data remains a major challenge.
Electronically registered clinical notes of patients who were hospitalized in the Cardiology wards of tertiary hospitals will be collected retrospectively, together with additional files related to each hospitalization, such as laboratory and imaging examinations. Given the size of the participating clinics and the number of years for which hospital records have been kept in electronic form, the sample is estimated at about 60,000 patient records. All information that could potentially be used to identify a person, such as name, ID number, postal code, place of residence and occupation, will be deleted from these electronic files. Only the age will be recorded, not the exact date of birth of each patient. Only the number of days of hospitalization will be recorded, not the exact dates of admission and discharge. Thus, the data cannot be assigned to a specific subject, as no additional information or identifiers will be collected for the subjects. After the files are anonymized, each patient's clinical note will be linked with a specific key ("identifier"). The electronic file that maps each "identifier" to the patient's clinical note will be stored in a secure hospital electronic location.
The fully anonymized files will initially be analyzed manually to extract information into a database containing all of the patients' clinical information, such as discharge diagnoses, medications, treatment protocols, and laboratory and diagnostic tests. At the same time, a sample (1/3) of the clinical notes will be analyzed to identify the keywords or phrases associated with each diagnosis (for example, the atrial fibrillation diagnosis will probably be recorded as "atrial fibrillation", "AF", etc.).
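The de-identification step described above can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the field names (`birth_date`, `admission_date`, `discharge_date`, `record_ref`, `note_text`) and the use of a random hexadecimal key are assumptions, since the record schema and key format are not specified here. The sketch only handles structured fields and does not scrub identifiers that may appear inside the free text itself.

```python
import secrets
from datetime import date

# Hypothetical field names; the study does not specify a record schema.
DIRECT_IDENTIFIERS = {"name", "id_number", "postal_code", "residence", "occupation"}
DATE_FIELDS = {"birth_date", "admission_date", "discharge_date"}


def deidentify(record: dict) -> tuple[dict, dict]:
    """Return (clean_record, link_entry) for one raw hospital record."""
    birth = record["birth_date"]
    admitted = record["admission_date"]
    discharged = record["discharge_date"]

    # Age in completed years at admission replaces the exact date of birth.
    had_birthday = (admitted.month, admitted.day) >= (birth.month, birth.day)
    age = admitted.year - birth.year - (0 if had_birthday else 1)

    # Number of days in hospital replaces the exact admission/discharge dates.
    stay_days = (discharged - admitted).days

    # Random key, deliberately not derived from any patient attribute.
    key = secrets.token_hex(8)

    dropped = DIRECT_IDENTIFIERS | DATE_FIELDS | {"record_ref"}
    clean = {k: v for k, v in record.items() if k not in dropped}
    clean.update({"key": key, "age": age, "stay_days": stay_days})

    # The link entry is what would be kept in the secure hospital location.
    link = {"key": key, "record_ref": record["record_ref"]}
    return clean, link


raw = {
    "name": "Jane Doe",
    "id_number": "X123",
    "postal_code": "00000",
    "residence": "Somewhere",
    "occupation": "Teacher",
    "birth_date": date(1950, 6, 15),
    "admission_date": date(2020, 6, 10),
    "discharge_date": date(2020, 6, 17),
    "record_ref": "note-001",
    "note_text": "Paroxysmal AF, rate controlled.",
}
clean, link = deidentify(raw)
```

Keeping the key-to-note mapping in a separate structure mirrors the description above: the analysis dataset carries only the key, while the mapping stays in the secure location.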
Using this dictionary of keywords, and integrating artificial intelligence and text mining methods such as natural language processing (NLP), an automated extraction of data and diagnoses from these electronic medical notes will be attempted. The reliability and accuracy of the computational methods will be evaluated internally by comparing the automatically extracted data with the manually recorded data. In addition, the reliability and accuracy of these methods will be evaluated externally by applying them to the remaining 2/3 of the clinical notes, in which no association between keywords and specific diagnoses was attempted.
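As a rough illustration of dictionary-based extraction and its comparison against manual annotation, the sketch below matches keywords with regular expressions and computes micro-averaged precision and recall. The dictionary entries, function names and example notes are hypothetical, and the study may well use more sophisticated NLP methods. Plain keyword matching of this kind does not handle negation (for example "no AF") or misspellings.

```python
import re

# Illustrative dictionary; the real one would be built from the 1/3 sample.
KEYWORDS = {
    "atrial fibrillation": ["atrial fibrillation", "AF"],
    "heart failure": ["heart failure", "HF"],
}


def compile_patterns(keywords):
    """Build one case-insensitive, word-bounded regex per diagnosis."""
    return {
        dx: re.compile(
            r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
            re.IGNORECASE,
        )
        for dx, terms in keywords.items()
    }


def extract_diagnoses(note, patterns):
    """Return the set of diagnoses whose keywords occur in the note."""
    return {dx for dx, pat in patterns.items() if pat.search(note)}


def precision_recall(predicted, manual):
    """Micro-averaged precision and recall over paired sets of diagnoses."""
    tp = sum(len(p & m) for p, m in zip(predicted, manual))
    fp = sum(len(p - m) for p, m in zip(predicted, manual))
    fn = sum(len(m - p) for p, m in zip(predicted, manual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


patterns = compile_patterns(KEYWORDS)
notes = [
    "Paroxysmal AF, rate controlled.",
    "Admitted with decompensated heart failure.",
    "Stable after PCI, discharged home.",
]
# Manually recorded diagnoses for the same notes (the reference standard).
manual = [{"atrial fibrillation"}, {"heart failure"}, set()]
predicted = [extract_diagnoses(n, patterns) for n in notes]
precision, recall = precision_recall(predicted, manual)
```

The same `precision_recall` comparison could be run once on the notes used to build the dictionary (internal evaluation) and once on the held-out notes (external evaluation).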
In Greece, the present study aims to be the first to analyze the usefulness of artificial intelligence for the automated extraction and processing of unstructured clinical data from patients' clinical notes. The results of this study are expected to have a positive impact on:
- the automation of large-scale data analysis and processing procedures
- the rapid epidemiological recording and utilization of clinical data
- the early diagnosis of diseases
- the development of phenotypic profiles of patients who could benefit from targeted therapies
- the development of clinical decision support systems that will provide information about the possible clinical course of patients after hospital discharge and assist medical decisions
- the development and validation of prognostic models for major cardiovascular diseases
Eligibility
Inclusion Criteria:
- Patients hospitalized in Cardiology departments in Greece
- Patients whose medical records are electronically stored in each hospital's computer/information systems
Exclusion Criteria:
- Patients who died during hospitalization and for whom, therefore, no discharge letter was issued