How MS registries and data custodians can help

Are you a data custodian?
Find out how you can support the Global Data Sharing Initiative

 

How can registries join the Global Data Sharing Initiative?

 
Two online tutorials explain how data custodians can join the Global Data Sharing Initiative and share their data on the central platform:

  1. Introduction demo: “How can registries join the Global Data Sharing Initiative?”
  2. Demo: “How to import into the QMENTA platform”

Step 1: DATA COLLECTION

We recommend that all data custodians wanting to work on COVID-19 and MS implement the COVID-19 core dataset within their protocols. This dataset was developed by a global data task force and is based on the International MuSC-19 Case Reporting platform and the UK MS Register sub-study protocols.

A detailed dictionary describing the COVID-19 core dataset is provided:

If you are planning to implement the COVID-19 core dataset, please let us know by filling in this Google document. This will allow us to list your initiative on our website (patient-reported initiatives, initiatives open to healthcare professionals), so that we can direct people with MS and healthcare professionals to your initiative.

 

Step 2: DATA SHARING

1) Sharing de-identified patient-level data into the central platform

To support you and to keep the time you need to spend on the Global Data Sharing Initiative to a minimum, we have set up a “data wrangling task force”. This task force is a group of data scientists with extensive expertise in data management, pre-processing and harmonization. They will help you with all preparations to share your COVID-19 core dataset (more information below). Contact lotte.geys@uhasselt.be if you would like support from our data wrangling task force.

1.1 Ethical and legal restrictions of the data sharing procedure

We take ethics, privacy and security very seriously. Several measures are in place to ensure that the pipeline and protocols comply with ethical and legal restrictions.

  • A Master Data Transfer Agreement is available to cover your involvement within the initiative.
  • The protocol of this project has been approved by the ethics committee of UHasselt (Belgium).
  • A Disclosure Risk Assessment was performed by an independent third party (P-95).
  • The central platform (provided by QMENTA) is ISO-certified to handle medical data. The system is therefore able to contain personal health information (PHI). Once data is uploaded into QMENTA, it will not leave the QMENTA environment. The system is locked, and data cannot be downloaded from it. QMENTA also tracks all user activity and supports fine-grained permissions at the individual user level.
  • Access to the patient-level data is restricted to the members of the task forces only. More specifically, the number of people accessing patient-level data is restricted to 12 people (6 data wranglers and 6 data analysts). An agreement with the members of these task forces is currently in preparation.

Contact lotte.geys@uhasselt.be if you want to consult these measures in more detail (e.g. to review the Data Transfer Agreement, the ethical approval or the Disclosure Risk Assessment).

1.2 Prepare your de-identified export of your COVID-19 dataset

  • Your export should match the defined data dictionary as closely as possible. A transformation will most likely be needed. An example of a COVID-19 dataset ready for import is provided here.
  • We encourage you to use a “transformation code” for this. It allows you to re-create the export regularly, since we hope to update your data in the platform as often as possible (e.g. weekly, every two weeks, …).
  • A transformation code is a script (in any coding language you prefer, e.g. R, SAS, Python, …) that transforms the format and structure of your dataset into the desired format and structure of the COVID-19 data export.
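
As an illustration of what such a transformation code could look like, here is a minimal Python sketch. It is not the task force's own script. All local column names (`PatientName`, `Gender`, `Age`, `CovidPositive`), target names and value codes are hypothetical placeholders; the real target names and codes have to be taken from the COVID-19 core dataset dictionary. The sketch keeps only the mapped columns, so any unmapped column (such as a name field) is left out of the export. That alone does not amount to a complete de-identification procedure.

```python
import csv
import io

# Hypothetical mapping from local column names to target column names.
# Replace the target names with those in the COVID-19 core dataset dictionary.
COLUMN_MAP = {
    "Gender": "sex",
    "Age": "age_years",
    "CovidPositive": "covid19_confirmed",
}

# Hypothetical recoding of local values to target codes, per target column.
VALUE_MAP = {
    "sex": {"M": "male", "F": "female"},
    "covid19_confirmed": {"Y": "yes", "N": "no"},
}


def transform_row(row):
    """Rename mapped columns, recode their values, drop unmapped columns."""
    out = {}
    for local_name, target_name in COLUMN_MAP.items():
        value = (row.get(local_name) or "").strip()
        recode = VALUE_MAP.get(target_name, {})
        # Values without a recoding rule are passed through unchanged.
        out[target_name] = recode.get(value, value)
    return out


def transform_csv(source, target):
    """Read a local CSV export from `source`, write the transformed CSV to `target`."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(target, fieldnames=list(COLUMN_MAP.values()))
    writer.writeheader()
    for row in reader:
        writer.writerow(transform_row(row))


if __name__ == "__main__":
    # Made-up example input; a real run would open files instead.
    local_export = io.StringIO(
        "PatientName,Gender,Age,CovidPositive\n"
        "Jane Doe,F,42,Y\n"
    )
    result = io.StringIO()
    transform_csv(local_export, result)
    print(result.getvalue())
```

Because the mapping lives in two small dictionaries, the same script can be re-run unchanged every time a new export is due.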

Our data wrangling task force is on standby to support you with this step. Contact lotte.geys@uhasselt.be if you would like support from our data wranglers.

1.3 Import your de-identified COVID-19 dataset into the central platform

The central platform, kindly provided by QMENTA, allows very fast and user-friendly imports. A manual is provided here. In addition, an interactive demo is available here.
Our data wrangling task force is on standby to support you with this step. Contact lotte.geys@uhasselt.be if you would like support from our data wranglers.

2) Aggregated data sharing (federated pipeline)

If you are not able to share patient-level data but are willing to share aggregated data, you can run the federated Python script. The script assumes that the data is harmonized locally to the COVID-19 core dataset as described in the dictionary. After you run the script locally, the resulting counts are shared and combined with the counts from the data inside the central platform. The advantage of this federated pipeline is that regulatory and privacy concerns are reduced. However, the main disadvantages of this approach are:

  1. We lose some of the ability to explore a fully combined dataset, so looking for patterns and forming hypotheses is more difficult.
  2. Running local data queries can become time-consuming as the data analysis evolves.
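
To make the idea of sharing counts rather than records concrete, here is a minimal Python sketch of the general principle. It is not the GDSI's federated script. The variable names (`sex`, `covid19_confirmed`) and the example records are hypothetical. Each site counts its records per combination of variables, and only those counts are combined.

```python
from collections import Counter


def local_counts(records, variables):
    """Count records per combination of the given variables.

    Only these counts, not the records themselves, would leave the registry.
    """
    return Counter(tuple(rec[v] for v in variables) for rec in records)


def combine_counts(*count_tables):
    """Sum the count tables from several sources into one table."""
    total = Counter()
    for table in count_tables:
        total.update(table)  # Counter.update adds counts per key
    return total


if __name__ == "__main__":
    variables = ("sex", "covid19_confirmed")  # hypothetical variable names

    # Made-up records standing in for two harmonized local datasets.
    registry_a = [
        {"sex": "female", "covid19_confirmed": "yes"},
        {"sex": "female", "covid19_confirmed": "yes"},
        {"sex": "male", "covid19_confirmed": "no"},
    ]
    registry_b = [
        {"sex": "female", "covid19_confirmed": "yes"},
        {"sex": "male", "covid19_confirmed": "yes"},
    ]

    combined = combine_counts(
        local_counts(registry_a, variables),
        local_counts(registry_b, variables),
    )
    for key, n in sorted(combined.items()):
        print(dict(zip(variables, key)), n)
```

The sketch also shows the first disadvantage listed above: once the data is reduced to counts for a fixed set of variables, any new question requires another local run with different variables.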

Please contact Lotte Geys (lotte.geys@uhasselt.be) if you’re interested in becoming a “federated registry” sharing insights with the GDSI. We will then provide you with a Docker image, a manual for this option and a video. We are also happy to schedule a “live session” to run the script together and troubleshoot with you along the way.

Download the PDF for more information on the Global Data Sharing Initiative.

Questions? Please contact Lotte Geys: lotte.geys@uhasselt.be