Overview
This registry prospectively collects anonymized free-text symptom descriptions submitted voluntarily by adults through the OpenGenome platform at opengenome.bio. For each submission, the system retrieves real biomedical literature from PubMed and ClinicalTrials.gov in parallel, applies a constrained reasoning model operating under a strict output schema, and returns a structured biological signal report. The study evaluates the internal consistency of extracted signals, the calibration of confidence scores relative to dataset size and symptom specificity, and the distribution of biological signal categories across a large anonymous population. No intervention is assigned. No participant contact occurs. All data is anonymized at the point of collection.
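The parallel retrieval step can be illustrated with a minimal Python sketch. It is not the platform's implementation. The function names, the interleaved merge order, and the "PMID:" prefix are assumptions made for illustration, and the cap of 16 is taken from the Description. The two endpoints are the public NCBI E-utilities ESearch endpoint and the ClinicalTrials.gov v2 studies endpoint. The network layer is passed in as a pair of callables so the sketch can be run offline.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List
from urllib.parse import urlencode

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
CTGOV_STUDIES = "https://clinicaltrials.gov/api/v2/studies"
MAX_SOURCES = 16  # per-submission cap stated in the Description

# A fetcher takes a URL and returns a list of identifiers (hypothetical type).
Fetcher = Callable[[str], List[str]]


def pubmed_search_url(term: str, limit: int) -> str:
    """Build an ESearch query against the PubMed database."""
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": limit}
    return EUTILS_ESEARCH + "?" + urlencode(params)


def ctgov_search_url(term: str, limit: int) -> str:
    """Build a ClinicalTrials.gov v2 studies query."""
    params = {"query.term": term, "pageSize": limit}
    return CTGOV_STUDIES + "?" + urlencode(params)


def retrieve_sources(
    term: str,
    fetch_pmids: Fetcher,
    fetch_nct_ids: Fetcher,
    max_sources: int = MAX_SOURCES,
) -> List[str]:
    """Query both services concurrently and return at most max_sources IDs."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        pubmed_job = pool.submit(fetch_pmids, pubmed_search_url(term, max_sources))
        ctgov_job = pool.submit(fetch_nct_ids, ctgov_search_url(term, max_sources))
        pmids, nct_ids = pubmed_job.result(), ctgov_job.result()

    # Assumption: interleave the two result lists so neither service
    # crowds out the other before the cap is applied.
    merged: List[str] = []
    for i in range(max(len(pmids), len(nct_ids))):
        if i < len(pmids):
            merged.append("PMID:" + pmids[i])
        if i < len(nct_ids):
            merged.append(nct_ids[i])
    return merged[:max_sources]
```

In practice the two fetchers would issue the HTTP requests and parse the identifiers out of each service's JSON response. In a test they can be replaced by stubs that return fixed lists.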
Description
OpenGenome is a publicly accessible, anonymous research instrument that maps free-text symptom descriptions to structured biological signals grounded in primary biomedical literature. Upon submission, the platform dispatches parallel queries to PubMed (via NCBI E-utilities) and to the ClinicalTrials.gov v2 API, retrieving up to 16 real sources per submission. A reasoning model constrained by a strict schema extracts a primary biological signal, up to five secondary signals, a plain-language correlation explanation, a confidence score, and a signal strength score. Both scores are integers on a 0 to 100 scale. Each source is cited by its PMID or NCT identifier and is directly linkable for independent verification. This registry will analyze aggregate anonymized outputs to characterize signal consistency, score calibration, and population-level signal distributions.
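The output constraints listed above can be expressed as a small validating record. The following Python sketch is illustrative only: the class name, field names, and the "PMID:" prefix are assumptions, not the platform's actual schema. It enforces only what the description states: one primary signal, at most five secondary signals, two integer scores from 0 to 100, and at most 16 sources identified by PMID or NCT identifier.

```python
import re
from dataclasses import dataclass, field
from typing import List

# NCT identifiers are "NCT" followed by eight digits; the "PMID:" prefix
# is an assumed convention for this sketch.
_SOURCE_ID = re.compile(r"^(PMID:\d+|NCT\d{8})$")


def _check_score(name: str, value: object) -> None:
    """Require an integer in [0, 100]; bool is rejected explicitly."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"{name} must be an integer")
    if not 0 <= value <= 100:
        raise ValueError(f"{name} must be between 0 and 100")


@dataclass(frozen=True)
class SignalReport:
    """Hypothetical structured report for one submission."""

    primary_signal: str
    explanation: str
    confidence: int
    signal_strength: int
    secondary_signals: List[str] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.primary_signal.strip():
            raise ValueError("primary_signal must not be empty")
        if len(self.secondary_signals) > 5:
            raise ValueError("at most five secondary signals are allowed")
        _check_score("confidence", self.confidence)
        _check_score("signal_strength", self.signal_strength)
        if len(self.sources) > 16:
            raise ValueError("at most 16 sources are allowed")
        for source in self.sources:
            if not _SOURCE_ID.match(source):
                raise ValueError(f"unrecognized source identifier: {source}")
```

Rejecting a malformed report at construction time means that only reports satisfying the stated constraints would enter the aggregate analysis.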
Eligibility
Exclusion Criteria:
- Automated or programmatically generated submissions flagged by rate limiting
- Submissions containing no discernible symptom or health-related content


