Real-world data (RWD) analysis

One-line summary

Real-world data analysis examines data from non-trial sources such as EHRs, claims, cancer registries, and patient-reported outcomes to estimate treatment effectiveness, treatment sequencing, and comparative outcomes. In settings where RCTs are under-enrolled or ethically difficult (rare drivers, elderly patients, multi-line sequencing, post-marketing safety), it generates real-world evidence (RWE) using causal inference frameworks such as target trial emulation, propensity scores, and inverse probability of treatment weighting (IPTW). FDA and PMDA now accept RWE in regulatory submissions, and applications to post-marketing supplements are increasing, for example through the Flatiron-FDA pilot.

Principles

Data sources

Source                          Characteristics
EHR (Flatiron, ConcertAI)       Structured and NLP-derived clinical data; detailed treatment and outcome records
Claims (Medicare, MDV, JMDC)    Broad population; rich cost and utilization data; limited clinical detail
Registry (SEER, NCDB, CGTNet)   Population-based; long-term outcomes; limited driver mutation information
Hybrid (AACR GENIE)             NGS panel results linked to clinical outcomes; large, multi-center
PRO (PRO-CTCAE, EQ-5D)          Longitudinal patient symptom and QOL data

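Causal inference methods

  • Target trial emulation: specify a hypothetical RCT, then emulate it with RWD; following the Hernán framework avoids biases such as immortal time bias
  • Propensity score matching / IPTW: adjusts for baseline imbalance between treatment groups
  • Difference-in-differences: compares the before-and-after change around a policy or approval change between affected and unaffected groups
  • Instrumental variable: uses physician preference or regional variation as the instrument
  • G-methods (g-formula, MSM, g-estimation): adjust for time-varying confounding
  • Negative control: uses negative control outcomes or exposures to detect and estimate unmeasured confounding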
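
As a minimal sketch of the IPTW item above, the following stdlib-only Python simulates a cohort in which a binary confounder drives both treatment choice and outcome, then compares the naive group difference with a weighted one. Everything here is an illustrative assumption: the function names, the simulated data, and the effect sizes are hypothetical and do not come from any real dataset. The propensity score is estimated by simple stratum frequencies, which stands in for the regression model that a real analysis with many covariates would need.

```python
import random


def simulate(n=20000, seed=0):
    """Simulate (confounder, treatment, outcome) rows. All values are hypothetical."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0        # binary confounder, e.g. poor baseline status
        p_treat = 0.2 if x else 0.7               # sicker patients are treated less often
        t = 1 if rng.random() < p_treat else 0
        y = 1.0 * t - 2.0 * x + rng.gauss(0, 1)   # assumed true treatment effect = 1.0
        rows.append((x, t, y))
    return rows


def propensity_by_stratum(rows):
    """Estimate P(treated | x) as the treated fraction within each confounder stratum."""
    counts = {}
    for x, t, _ in rows:
        n, k = counts.get(x, (0, 0))
        counts[x] = (n + 1, k + t)
    return {x: k / n for x, (n, k) in counts.items()}


def naive_difference(rows):
    """Unadjusted difference in mean outcome, treated minus untreated."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def iptw_difference(rows, ps):
    """Weighted difference in means with normalized inverse-probability weights."""
    num1 = den1 = num0 = den0 = 0.0
    for x, t, y in rows:
        if t == 1:
            w = 1.0 / ps[x]           # weight treated patients by 1 / e(x)
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - ps[x])   # weight untreated patients by 1 / (1 - e(x))
            num0 += w * y
            den0 += w
    return num1 / den1 - num0 / den0


rows = simulate()
ps = propensity_by_stratum(rows)
naive = naive_difference(rows)
iptw = iptw_difference(rows, ps)
print(f"naive difference: {naive:.2f}")
print(f"IPTW difference:  {iptw:.2f}")
```

In this simulation the naive estimate overstates the benefit because treated patients are healthier at baseline, while the weighted estimate should land close to the assumed effect of 1.0. The sketch adjusts only for a measured confounder; it says nothing about unmeasured confounding.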

Key evidence / areas of application

  • NSCLC IO post-approval: real-world OS after KEYNOTE-024; comparative effectiveness of first-line chemo + IO sequencing
  • Rare drivers (ROS1, RET, NTRK): RWD supplements the small sample sizes of trials and tracks post-approval outcomes
  • Elderly NSCLC: RWD compensates for the under-representation of patients aged ≥75 in RCTs
  • Treatment sequencing: sequencing patterns and outcomes after resistance to first-line (1L) osimertinib
  • Regulatory RWE: used for post-marketing supplements under FDA Project Pragmatica and the PMDA RWD pilot
  • Health equity / disparity: association of race, ethnicity, and SES with outcomes (trial cohorts are biased)

Applications and limitations

  • Strengths: large samples, generalizability, long-term outcomes, applicability to rare diseases, construction of external trial cohorts
  • Limitations: measured and unmeasured confounding; missing data (performance status (PS), outcomes); variable data quality (EHR vs. claims); incomplete driver and biomarker information; context-dependent regulatory acceptance (accepted for post-marketing use, limited for first approval)

Open Questions

  • AI-assisted EHR phenotyping: accuracy of LLM-based abstraction and its regulatory pathway
  • Multi-source linkage: harmonization of claims, EHR, genomic, and PRO data
  • Synthetic control arm: a regulatory framework for replacing an RCT control arm with RWD
  • Causal inference reproducibility: standardization of target trial emulation
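
One way to picture what standardizing a target trial emulation could involve is to write the protocol down as a structured record and check it for gaps before any analysis runs. The sketch below is only an illustration under that assumption: the class `TargetTrialProtocol`, its field names, and the example values are hypothetical, loosely following the protocol components usually listed for a target trial (eligibility, treatment strategies, assignment, follow-up, outcome, causal contrast, analysis plan). Time zero is kept as its own field here because misaligned time zero is a common route to immortal time bias.

```python
from dataclasses import dataclass, fields


@dataclass
class TargetTrialProtocol:
    """Hypothetical record of the protocol components of an emulated trial."""
    eligibility_criteria: str = ""
    treatment_strategies: str = ""
    assignment_procedure: str = ""
    time_zero: str = ""
    follow_up: str = ""
    outcome: str = ""
    causal_contrast: str = ""
    analysis_plan: str = ""


def missing_components(protocol):
    """Return the names of protocol components that are still blank."""
    return [f.name for f in fields(protocol)
            if not getattr(protocol, f.name).strip()]


# Example draft with illustrative, made-up entries.
draft = TargetTrialProtocol(
    eligibility_criteria="Advanced NSCLC, no prior systemic therapy",
    treatment_strategies="Start regimen A vs. start regimen B at baseline",
    time_zero="Date eligibility is met and a strategy is assigned",
    outcome="Overall survival",
)
gaps = missing_components(draft)
print(gaps)
```

A completeness check like this does not make an emulation valid, but it makes omissions visible and gives different groups a shared template to compare against.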

Related entities