Overview
The investigators' study group has developed a fully automated diagnostic framework, the information of appendix (IA) model, based on 3D convolutional neural networks (CNNs). The model identifies non-appendicitis, simple appendicitis, and complicated appendicitis on CT images using a two-stage binary classification algorithm, mirroring the way a clinician decides on treatment. The dataset was built from a large population of patients who visited emergency departments and underwent intravenous contrast-enhanced abdominopelvic CT to evaluate abdominal pain in the right or lower quadrant area as the chief complaint. Recently, the IA model was externally validated on a multicenter dataset obtained through data exfiltration. In this study, the investigators hypothesized that, in a prospectively randomized dataset, the IA model would show a negative appendectomy rate (NAR) comparable to that of non-radiologists within a non-inferiority margin of <10%, with a shorter interpretation time.
Description
Development of information of appendix model
- This study used a pretrained information of appendix (IA) model based on a fully automated diagnostic framework to predict three classes with probability and feature mapping: non-appendicitis, simple appendicitis, and complicated appendicitis.
The IA model pipeline, whose parameters were learned from the 3D CT images of 7,147 patients, consisted of a two-stage binary algorithm coupled with transfer learning using three 3D CNN models: DenseNet, EfficientNet, and ResNet.
- A two-stage binary algorithm for transfer learning was applied to learn inherent patterns from the 3D images of the three true classes, in the way a clinician does when deciding treatment for appendicitis. In the first step of the pipeline, the Stage 1 classification model was developed to identify non-appendicitis vs. appendicitis. In the second step of the pipeline, the Stage 2 classification model identified simple vs. complicated appendicitis from the data transferred with the trainable parameters learned in Stage 1.
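The two-stage decision logic described above can be sketched as follows. This is a minimal illustration, not the investigators' implementation: the `StageOutputs` container, the `classify` function, and the 0.5 decision threshold are hypothetical, and the two probabilities are assumed to come from already-trained Stage 1 and Stage 2 classifiers.

```python
from dataclasses import dataclass


@dataclass
class StageOutputs:
    """Hypothetical container for the two binary classifier outputs."""
    p_appendicitis: float  # Stage 1: P(appendicitis) vs. non-appendicitis
    p_complicated: float   # Stage 2: P(complicated) vs. simple appendicitis


def classify(outputs: StageOutputs, threshold: float = 0.5) -> str:
    """Combine two binary decisions into one of three classes.

    The threshold of 0.5 is an assumption made for illustration only.
    """
    # Stage 1: rule appendicitis in or out.
    if outputs.p_appendicitis < threshold:
        return "non-appendicitis"
    # Stage 2: only reached when Stage 1 indicates appendicitis.
    if outputs.p_complicated < threshold:
        return "simple appendicitis"
    return "complicated appendicitis"


if __name__ == "__main__":
    for case in (StageOutputs(0.1, 0.9), StageOutputs(0.8, 0.2), StageOutputs(0.9, 0.7)):
        print(case, "->", classify(case))
```

Note that the Stage 2 output is ignored whenever Stage 1 rules out appendicitis, which is what makes the procedure a cascade of two binary decisions rather than a single three-class prediction.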
Currently, the final IA model has been externally validated using a never-before-seen dataset from an outside institution with broad eligibility criteria for patients who visited the emergency room with abdominal pain and underwent abdominopelvic CT due to clinical suspicion of acute appendicitis.
Intervention
This is a comparative study between humans and the IA model of the negative appendectomy rate (NAR) across non-appendicitis, simple appendicitis, and complicated appendicitis in a prospectively randomized dataset collected from CT scans of patients who visited the emergency room with acute abdominal pain. The participants in the control arm were ten non-radiologists.
To protect human subjects, the IA model is not allowed to interfere with clinicians' preoperative decision-making in ongoing clinical practice, given the legal and ethical issues involved in interrogating the model's diagnostic framework for the mathematical reasoning behind an outcome.
Considering the minimal risk to human subjects and the ethical and legal aspects of artificial intelligence, the study design strictly allocates CT images to non-radiologists and the IA model only after completion of treatment for appendicitis, with or without pathologic confirmation.
Data allocation and performance evaluation
- To compare the IA model and human non-radiologists on equal terms, the interpretation conditions for each non-radiologist must meet all of the following criteria: only CT images, without variables such as age, sex, and laboratory values, are randomly assigned; the non-radiologist and the patient whose CT image is assigned must not share the same affiliation; the allocation ratio of non-appendicitis, simple appendicitis, and complicated appendicitis is blinded to the non-radiologist; and the range of the CT image for interpretation is identical to the ROI for the appendix generated by the automatic localization of the IA model. To evaluate human diagnostic performance, four test items were set up for each non-radiologist: (1) visualization of the appendix; (2) exclusion of appendicitis; (3) complications; and (4) diagnosis of CT images.
The primary and secondary outcomes were assessed using the results of the four questions for the participants and the diagnostic performance of the IA model in the same dataset.
Ethical statement
According to the "Guide to Utilizing Healthcare Data" and "Guide to Handling Pseudonym Information" of the Personal Information Protection Commission of the Ministry of Health and Welfare, data exfiltration with security assurance of the data archive was conducted with the approval of the Data Review Board at participating institutions. The central institutional review board (IRB) and IRBs of all participating institutions approved this study.
Sample size and statistical methods
The investigators' prior study, which completed external validation of the IA model through data exfiltration from external institutions, showed an NAR of 12.4% in the external dataset. The standard reference value of NAR for non-radiologists has been reported as 10.5% ± 5%. Therefore, for clinical application, the difference in NAR between the IA model and non-radiologists should fall within a non-inferiority margin of 10.0%.
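One way to check a non-inferiority criterion of this kind is to compare the upper confidence bound of the NAR difference (IA model minus non-radiologists) against the margin. The sketch below is an assumption about how such a check could be carried out, not the investigators' analysis: it uses a Wald interval for the difference of two proportions, the function name is hypothetical, and the per-arm sizes in the example call are placeholders rather than study data.

```python
from math import sqrt
from statistics import NormalDist


def noninferiority_check(nar_model, n_model, nar_ref, n_ref,
                         margin=0.10, alpha=0.05):
    """Return (difference, upper bound, non-inferior?) for two NARs.

    A lower NAR is better, so the model is treated as non-inferior when
    the upper bound of (model - reference) stays below the margin.
    Assumes a Wald interval with a two-sided alpha.
    """
    diff = nar_model - nar_ref
    se = sqrt(nar_model * (1 - nar_model) / n_model
              + nar_ref * (1 - nar_ref) / n_ref)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    upper = diff + z * se
    return diff, upper, upper < margin


if __name__ == "__main__":
    # NAR values are taken from the text; the arm sizes are hypothetical.
    diff, upper, ok = noninferiority_check(0.124, 263, 0.105, 263)
    print(f"difference={diff:.3f}, upper bound={upper:.3f}, non-inferior={ok}")
```

The conclusion depends strongly on the number of cases per arm: with the same rates and a much smaller sample, the interval widens and can cross the margin.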
The sample size of the randomized controlled dataset for a single institution was 263, assuming a 95% confidence interval, statistical power of 90%, a two-sided significance level of α = 0.05, and an effect size of 0.2 for the Z-test. Two external institutions participated in this study. Because of data heterogeneity, normalization, and differences in prevalence across regions between the two external institutions, the sample size was calculated for each institution. Therefore, assuming 263 patients per institution with a follow-up loss of 10%, the total sample size was 568.
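The per-institution figure of 263 is consistent with the standard normal-approximation formula n = ((z_{1-α/2} + z_{1-β}) / d)^2, where d is the standardized effect size. The sketch below assumes a one-sample Z-test on a standardized effect size, which the text does not state explicitly; the function name is hypothetical, and the adjustment for follow-up loss is not modeled.

```python
from math import ceil
from statistics import NormalDist


def z_test_sample_size(effect_size, alpha=0.05, power=0.90):
    """Sample size for a two-sided one-sample Z-test (assumed design)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.960 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 1.282 for 90% power
    # Round up so that the achieved power is at least the requested power.
    return ceil(((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    print(z_test_sample_size(effect_size=0.2))
```

With α = 0.05, power = 0.90, and d = 0.2, the formula gives approximately 262.7, which rounds up to 263.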
Eligibility
Inclusion Criteria:
Inclusion criteria for broad eligibility were applied to reflect that the CT utilization rate in the emergency room has rapidly expanded, presumably in many other institutions as well, where physicians maintain a reasonably sensitive standpoint in raising a clinical suspicion of appendicitis as a cause of abdominal pain and then use CT as an imaging test to confirm or rule out appendicitis. Anonymized CT images of patients were referred to the randomized dataset when the imaging protocol parameters were as follows: abdomen or pelvis (intravenous contrast, 2 mg/kg, maximum 160 mL), scan timing (portal venous phase), range (from 4 cm above the liver dome to 1 cm below the ischial tuberosity), radiation dose (tube potential, 100 to 120 kVp), pitch 1.75:1, and reconstruction (5 mm slice thickness for adults; 3 mm slice thickness for children under 12 years old).
Exclusion Criteria:
Patients who did not fulfill the CT imaging protocol were excluded, in detail as follows:
i) Failure to meet the CT protocol criteria of this study: liver CT, biliary CT, etc. (if the contrast phase was different); ureter CT, etc. (if contrast media was not used and the reconstruction method was different); non-enhanced CT (when contrast media was not used); and appendix CT or low-dose CT (when the radiation dose was low).
ii) Significantly reduced CT image quality: blurring (motion artifact) or metal artifact (for example, from internal fixation after spinal surgery).
iii) Evidence from the medical record review that APCT was performed because of the suspicion of a condition other than appendicitis: suspected acute cholecystitis due to RUQ tenderness and Murphy's sign; suspected urolithiasis due to flank pain and gross hematuria; suspected pancreatitis due to a history of pancreatitis or alcohol abuse; suspected gynecological diseases due to vaginal discharge; and suspected panperitonitis due to whole abdominal tenderness, rebound tenderness, and unstable vital signs. Patients with acute cholecystitis, ureteral stones, pancreatitis, or acute peritonitis due to small bowel or colon perforation were also excluded.
iv) Patients younger than 10 years were excluded. Adolescent patients from 11 to 18 years old were included in the study if the exclusion criteria were not applicable.
v) Patients diagnosed by ultrasonography.
vi) Patients who were transferred to the emergency department after a diagnosis of appendicitis at an outside hospital or ambulatory care.
vii) Patients with appendicitis who did not undergo surgical treatment because of the enrollment protocol of other ongoing studies.
viii) Patients who had undergone an appendectomy.