Real-World Data Problems and Governance Practices for Multi-source Heterogeneous Data
doi:10.3969/J.ISSN.1672-7274.2025.06.031
CLC number: TP393.08  Document code: B  Article ID: 1672-7274(2025)06-0092-03
Real-World Data Problems and Governance Practices of Multi-source Heterogeneous Data
WANG Weiyu, DIAO Jiaxing, WANG Hui (Harbin Institute of Information Engineering, Harbin 150431, China)
Abstract: This article focuses on hospital HIS data, analyzes in depth the complex process of cross-hospital data governance, and systematically summarizes common real-world data problems and their corresponding solutions. It aims to provide a practical research framework and ideas for addressing data governance challenges in real-world research, promoting efficient utilization of data resources and the scientific translation of research results.
Keywords: real-world research; multi-source heterogeneous data; HIS data; medical big data; data governance
1 Characteristics of Real-World Data
1.1 Insufficient Accuracy
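Although the hospital information system (HIS) is the source of data collection, every hospital has original records that are missing or were entered incorrectly. Following a unified standard, this study compiled 156 common HIS fields in seven categories: ① basic information, such as sex, age, and occupation; ② admission and discharge records, such as admission time, chief complaint, past medical history, discharge time, and course of diagnosis and treatment; ③ traditional Chinese medicine (TCM) and Western medicine diagnoses, such as disease name, disease code, syndrome name, and syndrome code; ④ vital signs, such as measurement time, measurement item, and result; ⑤ physician orders, such as content, time, specification, frequency, and quantity; ⑥ laboratory test results, such as test item, test result, unit, and reference range; ⑦ examination results, such as body part examined, examination description, conclusion, and time.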
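The field inventory above can be illustrated with a minimal Python sketch. It shows one possible way to organize the seven categories and to measure how often each field is missing in a set of records. The English field names, the `FIELD_CATEGORIES` mapping, and the `missing_rate` helper are all hypothetical and introduced only for illustration. The mapping lists only the example fields named in the text, not the full 156-field standard, which is not reproduced here.

```python
from typing import Any, Dict, List, Mapping, Sequence

# Hypothetical field names: only the examples named in the text,
# not the full 156-field inventory compiled in the study.
FIELD_CATEGORIES: Dict[str, List[str]] = {
    "basic_information": ["sex", "age", "occupation"],
    "admission_discharge_record": [
        "admission_time", "chief_complaint", "past_history",
        "discharge_time", "treatment_course",
    ],
    "tcm_western_diagnosis": [
        "disease_name", "disease_code", "syndrome_name", "syndrome_code",
    ],
    "vital_signs": ["measurement_time", "measurement_item", "result"],
    "physician_orders": [
        "content", "time", "specification", "frequency", "quantity",
    ],
    "lab_test_results": ["test_item", "test_result", "unit", "reference_range"],
    "examination_results": ["body_part", "description", "conclusion", "time"],
}


def is_missing(value: Any) -> bool:
    """Treat None and blank strings as missing (an assumed convention)."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def missing_rate(
    records: Sequence[Mapping[str, Any]], fields: Sequence[str]
) -> Dict[str, float]:
    """Return, per field, the fraction of records where it is absent or blank."""
    if not records:
        return {field: 0.0 for field in fields}
    return {
        field: sum(is_missing(r.get(field)) for r in records) / len(records)
        for field in fields
    }


# Usage with two made-up records (illustrative data, not from the study).
sample = [
    {"sex": "F", "age": 63, "occupation": ""},
    {"sex": "M", "age": None, "occupation": "teacher"},
]
rates = missing_rate(sample, FIELD_CATEGORIES["basic_information"])
print(rates)
```

A per-field missing rate of this kind is only one simple indicator of the accuracy problem described here. Detecting incorrect entries, as opposed to missing ones, would need field-specific validation rules that the text does not specify.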