Medical data sharing and synthetic clinical data generation - maximizing biomedical resource utilization and minimizing participant re-identification risks.

Created on 17 Aug 2025

Authors

Simeone Marino, Ruth Cassidy, Joseph Nanni, Yuxuan Wang, Yipeng Liu, Mingyi Tang, Yuan Yuan, Toby Chen, Anik Sinha, Balaji Pandian, Ivo D Dinov, Michael L Burns

Published in

NPJ digital medicine. Volume 8. Issue 1. Pages 526. Aug 16, 2025. Epub Aug 16, 2025.

Abstract

The sensitive nature of electronic health records (EHR) and wearable data presents challenges in sharing biomedical resources while minimizing re-identification risks. This article introduces an end-to-end, titratable pipeline that generates privacy-preserving "digital twin" datasets from complex EHR and wearable-device records (Apple Watch data from 3029 participants) using DataSifter and Synthetic Data Vault (SDV) methods. Various obfuscation levels were applied (DataSifter: small, medium, large; SDV: CTGAN, Gaussian Copula) and benchmarked using utility (statistical fidelity, machine learning performance) and privacy (re-identification risk, detection likelihood) metrics. The highest-obfuscation DataSifter twin delivered the strongest privacy protection (0.83) while preserving key statistical and predictive signals (83.1% confidence interval overlap in regression models), outperforming SDV, particularly for longitudinal data. Despite declining performance in machine learning tasks with higher obfuscation, utility was generally preserved. The study underscores the importance of digital twin datasets and highlights DataSifter's adaptability in privacy-utility trade-offs, advocating its utility for secure data sharing.
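The abstract reports utility partly as "confidence interval overlap in regression models" but does not define the metric. As a minimal sketch, assuming the commonly used definition (the overlap length divided by each interval's length, averaged over the two intervals), the calculation for one coefficient might look like the following. The function name `ci_overlap`, the clipping of disjoint intervals to zero, and the example intervals are illustrative assumptions, not the authors' implementation or data.

```python
def ci_overlap(lo_orig, hi_orig, lo_syn, hi_syn):
    """Average relative overlap of two confidence intervals.

    Assumed definition (not taken from the source): the length of the
    intersection is divided by each interval's own length, and the two
    ratios are averaged. Disjoint intervals are clipped to 0 here.
    """
    if hi_orig <= lo_orig or hi_syn <= lo_syn:
        raise ValueError("each interval must have positive length")
    # Length of the intersection; zero if the intervals do not meet.
    intersection = max(0.0, min(hi_orig, hi_syn) - max(lo_orig, lo_syn))
    return 0.5 * (intersection / (hi_orig - lo_orig)
                  + intersection / (hi_syn - lo_syn))


# Hypothetical intervals for one regression coefficient, fitted once on
# the original data and once on a synthetic "digital twin".
print(ci_overlap(0.0, 2.0, 1.0, 3.0))  # half of each interval is shared
```

Under this definition a value of 1 means the two intervals coincide and a value of 0 means they do not overlap; a study-level figure would typically average the per-coefficient values.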

PMID: 40818998
Bibliographic data and abstract were imported from PubMed on 17 Aug 2025.
