Overview
This prospective cohort study aims to develop and validate a personalized disease risk prediction model for adults by integrating multiple sources of health data. The study will recruit community-dwelling adults aged 18 years and older in Taiwan. After providing informed consent, participants will complete a structured questionnaire, undergo pure tone audiometry, and wear a smartwatch for 2 weeks to collect continuous physiological data, including heart rate and physical activity. With participant authorization, the study will also collect data from personal health records and national health insurance databases to allow longer-term follow-up of health outcomes.
The main goals of the study are to examine the relationships among hearing, lifestyle factors, and wearable device data; to identify combinations of risk factors associated with progression from health to subclinical or chronic disease states; and to develop analytical methods for integrating heterogeneous health data from questionnaires, physiological monitoring, hearing tests, and medical databases. Machine learning methods will be used to identify important predictors and build risk prediction models.
The study hypothesis is that combining hearing measures, lifestyle information, wearable physiological data, and longitudinal medical record data will identify individuals at higher risk of future disease more accurately than any single source of information alone. The long-term objective is to support early risk identification, personalized health management, and prevention strategies among community-dwelling adults.
Description
This study is a prospective cohort study designed to integrate multimodal health data for the development and validation of personalized disease risk prediction models in community-dwelling adults in Taiwan. The study focuses on combining actively collected research data, continuous wearable device data, hearing assessment results, and longitudinal health records to better understand the transition from health to subclinical states and chronic disease.
Participants aged 18 years and older will be recruited from community settings. After informed consent is obtained, study procedures will include a structured questionnaire, pure tone audiometry, and 2 weeks of smartwatch monitoring. The questionnaire will collect demographic characteristics, personal and family disease history, and lifestyle factors such as exercise, sleep, smoking, and alcohol use. Hearing function will be assessed using pure tone audiometry. Continuous physiological data collected from the wearable device will include heart rate and physical activity, such as step counts. Research staff will assist participants with device setup, application installation, and instructions for use to improve data completeness and consistency.
With participant authorization, the study will also obtain personal health record data and link study data with national health insurance databases to support longitudinal follow-up and ascertainment of disease outcomes. These linked data sources may include outpatient, inpatient, pharmacy, insurance enrollment, catastrophic illness, death registry, cancer registry, and adult preventive health examination records. The integration of these heterogeneous data sources is intended to provide a more complete picture of individual health trajectories and disease progression than can be achieved with any single data source alone.
The scientific objectives of the study are to:
- evaluate the associations among hearing status, lifestyle factors, and continuous wearable-derived physiological measures;
- identify combinations of key predictors associated with progression from health to subclinical or chronic disease states;
- establish an analytical framework for integrating heterogeneous data from questionnaires, hearing tests, wearable monitoring, and medical databases; and
- develop and validate accurate personalized disease risk prediction models using statistical and machine learning methods.
Data processing will include data cleaning, handling of missing values, outlier checking, standardization, and cross-source data integration. Statistical analyses will include correlation and regression approaches to examine relationships among hearing, lifestyle, and wearable variables. Machine learning methods, including feature selection and supervised learning algorithms such as random forests and gradient boosting methods, will be used to identify important predictors and construct risk prediction models. Model performance will be evaluated using measures such as area under the receiver operating characteristic curve and accuracy. Internal and, where available, external validation strategies will be used to assess robustness and generalizability.
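As an illustration of the two performance measures named above, the following is a minimal sketch in plain Python, not the study's analysis code. The function names `roc_auc` and `accuracy`, the 0.5 classification threshold, and the toy labels and scores are assumptions made for this example only. The area under the receiver operating characteristic curve is computed here through its rank-sum (Mann-Whitney U) formulation, with tied scores given average ranks.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formulation."""
    n = len(y_true)
    if n != len(scores):
        raise ValueError("y_true and scores must have the same length")

    # Sort indices by score, then assign 1-based ranks, averaging over ties.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")

    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


def accuracy(y_true: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Share of correct predictions after thresholding (threshold is an assumption)."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, y_true)) / len(y_true)


# Toy data for illustration only: 1 = developed disease, 0 = did not.
labels = [0, 0, 1, 1]
predicted_risk = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, predicted_risk), accuracy(labels, predicted_risk))
```

Accuracy depends on the chosen threshold and on outcome prevalence, whereas the area under the curve summarizes ranking performance across all thresholds, which is one reason the two measures are typically reported together.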
The central hypothesis of the study is that integrating hearing measures, lifestyle information, wearable physiological data, and longitudinal medical record data will improve disease risk prediction compared with models based on a single type of data alone. The long-term goal is to support early identification of high-risk individuals, facilitate personalized health management, and provide evidence for preventive health strategies in adult populations.
A strong emphasis will be placed on data privacy and confidentiality. All study data will be coded using unique study identifiers. The linkage file connecting study identifiers to personal identifiers will be stored separately with restricted access and encryption. After data collection, cleaning, and linkage procedures are completed and verified, the linkage file will be permanently destroyed so that subsequent analyses are conducted on de-identified data only. Any reports or publications resulting from the study will present aggregated findings without personally identifiable information.
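The coding step described above can be sketched as follows. This is a hypothetical illustration, not the study's actual procedure: the function names `assign_study_ids` and `code_records`, the `personal_id` field, the `S` prefix, and the example records are all assumptions. Study identifiers are drawn at random so that they carry no information about the person, and the linkage map is kept as a separate object that can be stored apart from the coded data and later destroyed.

```python
import secrets
from typing import Dict, Iterable, List


def assign_study_ids(personal_ids: Iterable[str]) -> Dict[str, str]:
    """Build a linkage map {personal_id: study_id} using random, unique study IDs."""
    linkage: Dict[str, str] = {}
    used = set()
    for pid in personal_ids:
        if pid in linkage:
            continue  # the same person always keeps the same study ID
        study_id = "S" + secrets.token_hex(8)
        while study_id in used:  # regenerate in the unlikely event of a collision
            study_id = "S" + secrets.token_hex(8)
        used.add(study_id)
        linkage[pid] = study_id
    return linkage


def code_records(records: List[dict], linkage: Dict[str, str],
                 id_field: str = "personal_id") -> List[dict]:
    """Replace the personal identifier in each record with its study identifier."""
    coded = []
    for record in records:
        out = {k: v for k, v in record.items() if k != id_field}
        out["study_id"] = linkage[record[id_field]]
        coded.append(out)
    return coded


# Hypothetical example records; identifiers and values are made up.
raw = [
    {"personal_id": "A123", "resting_hr": 62},
    {"personal_id": "B456", "resting_hr": 71},
]
linkage_map = assign_study_ids(r["personal_id"] for r in raw)
coded_data = code_records(raw, linkage_map)
```

This sketch only replaces the direct identifier. Other fields that could identify a person indirectly would need separate review, and encryption and access control for the linkage file are outside its scope.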
Eligibility
Inclusion Criteria:
- Adults aged 18 years and older
- Living in the community in Taiwan
- Able to understand the study procedures and provide written informed consent
- Able and willing to complete the study questionnaire, hearing assessment, and wearable device monitoring procedures
- Has access to a smartphone and is able to install and use the study-related application, with assistance from study staff if needed
Exclusion Criteria:
- Diagnosis of dementia
- Frailty or other health conditions that make participation in the study procedures infeasible
- Bilateral deafness without use of any hearing assistive device
- Does not have a smartphone or is unable to use a smartphone application required for the study procedures
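The criteria above could be expressed as a simple screening check, sketched below. The `Candidate` fields and the `screening_failures` function are hypothetical names chosen for this illustration and do not describe the study's actual screening form. Clinical judgments such as frailty or capacity to consent are represented as yes/no inputs that study staff would determine.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    # All field names are assumptions made for this sketch.
    age_years: int
    lives_in_community_in_taiwan: bool
    can_give_informed_consent: bool
    willing_to_complete_procedures: bool
    can_use_smartphone_app: bool  # with staff assistance if needed
    has_dementia_diagnosis: bool
    participation_infeasible: bool  # frailty or other health conditions
    bilateral_deafness: bool
    uses_hearing_device: bool


def screening_failures(c: Candidate) -> List[str]:
    """Return the reasons a candidate is not eligible (empty list if eligible)."""
    reasons = []
    if c.age_years < 18:
        reasons.append("younger than 18 years")
    if not c.lives_in_community_in_taiwan:
        reasons.append("not living in the community in Taiwan")
    if not c.can_give_informed_consent:
        reasons.append("unable to provide written informed consent")
    if not c.willing_to_complete_procedures:
        reasons.append("unable or unwilling to complete study procedures")
    if not c.can_use_smartphone_app:
        reasons.append("no smartphone or unable to use the study application")
    if c.has_dementia_diagnosis:
        reasons.append("diagnosis of dementia")
    if c.participation_infeasible:
        reasons.append("frailty or other condition makes participation infeasible")
    if c.bilateral_deafness and not c.uses_hearing_device:
        reasons.append("bilateral deafness without a hearing assistive device")
    return reasons
```

Returning the list of reasons, rather than a single yes/no value, would let screening logs record why each candidate was excluded.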


