Dataset Sharing Plan

A Dataset Sharing Plan (DataDSP) is a structured framework for documenting the sharing, storage, and usage details of datasets within MC2 Center-supported Synapse projects. These plans help ensure datasets are traceable, well-organized, and compliant with regulations regarding data sharing, accessibility, and ethical use. By defining attributes such as dataset names, assays, species, and sharing permissions, the DataDSP model facilitates efficient data management and collaboration across research projects.

This page outlines the key attributes required to create a data sharing plan, guiding users on how to structure and share datasets while adhering to best practices. It also demonstrates how attributes like planned upload dates, file formats, and grant numbers are used to ensure compliance with both internal and external data-sharing policies.

Why You Should Contribute DataDSP Entries

Contributing DataDSP entries benefits your research and projects by improving data accessibility, organization, and compliance. With complete entries, you enhance collaboration opportunities, simplify data sharing, and increase the impact and visibility of your datasets in research communities. By ensuring your data is properly documented and discoverable, you also reduce administrative burdens during audits, grant reporting, and data requests.

Who Should Be Contributing DataDSP Entries?

  1. Principal Investigators (PIs) – Gain recognition for your research by making your data easily accessible and well-documented, improving citation potential and collaboration opportunities.
  2. Data Managers – Ensure efficient data organization and compliance, minimizing time spent addressing data queries or audits.
  3. Research Coordinators – Streamline project workflows by contributing accurate metadata, reducing delays in data sharing and project reporting.
  4. Consortium Members – Enhance collaboration by contributing standardized data entries, ensuring that datasets are usable across multiple institutions and research projects.

Download Template

Download the DataDSP CSV template for streamlined data entry, and make sure that all required fields are filled out.

Example Data Entry

The table below includes sample values to demonstrate proper attribute usage.

| Attribute | Example Value |
| --- | --- |
| DSP Dataset Name | DSP_Dataset_Lung_Research_2021 |
| DSP Dataset Alias | Syn123456 |
| DSP Dataset Assay | 3D Bioprinting |
| DSP Dataset Species | Asian Elephant |
| DSP Dataset File Formats | CSV, JSON |
| DSP Planned Upload Date | 2022-12-01 |
| DSP Dataset Grant Number | CA209971 |
| DSP Dataset Description | "A quick description of your data sharing plan." |
| DSP Dataset Destination | "/home/user/datasets/dsp_output" |
| DSP Dataset Url | https://www.example.com/dataset/dsp1234 |
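
The sample values above can be written out as a one-row CSV. The following is a minimal Python sketch, not an official tool. It assumes that the template's column headers match the attribute names shown in the table, which should be checked against the downloaded template. The helper name `entry_to_csv` is illustrative.

```python
import csv
import io

# Example values copied from the table above. The column headers are an
# assumption: they mirror the attribute names, not a confirmed template layout.
EXAMPLE_ENTRY = {
    "DSP Dataset Name": "DSP_Dataset_Lung_Research_2021",
    "DSP Dataset Alias": "Syn123456",
    "DSP Dataset Assay": "3D Bioprinting",
    "DSP Dataset Species": "Asian Elephant",
    "DSP Dataset File Formats": "CSV, JSON",
    "DSP Planned Upload Date": "2022-12-01",
    "DSP Dataset Grant Number": "CA209971",
    "DSP Dataset Description": "A quick description of your data sharing plan.",
    "DSP Dataset Destination": "/home/user/datasets/dsp_output",
    "DSP Dataset Url": "https://www.example.com/dataset/dsp1234",
}


def entry_to_csv(entry):
    """Serialize one entry as CSV text with a header row (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(entry))
    writer.writeheader()
    writer.writerow(entry)
    return buffer.getvalue()


csv_text = entry_to_csv(EXAMPLE_ENTRY)
print(csv_text)
```

Because multi-valued attributes are comma separated, a value such as `CSV, JSON` has to be quoted so that it stays in a single column. The `csv` module does this automatically, which is the main reason to prefer it over joining strings by hand.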

Full Field Reference

Below is the full field reference table with attributes and their descriptions.

| Attribute | Description | Required | Validation Rules | Examples |
| --- | --- | --- | --- | --- |
| DataDSP | Dataset sharing plan information. Used to indicate planned usage of MC2 Center-supported Synapse projects. | False | None | (none) |
| DSP Dataset Name | Name of the dataset. | True | str | DSP_Dataset_Lung_Research_2021 |
| DSP Dataset Alias | Alias of the dataset. For Synapse storage, the Synapse ID associated with the storage folder should be used. Must be unique. | False | unique | (none) |
| DSP Dataset Assay | The type of data contained in this group of files. Multiple values permitted, comma separated. | True | list like | 3D Bioprinting |
| DSP Dataset Species | The species from which the data was collected. Multiple values permitted, comma separated. | True | list like | Asian Elephant |
| DSP Dataset Tumor Type | The tumor type(s), if applicable, from which the data was collected. Multiple values permitted, comma separated. | False | list like | Lung Carcinoma |
| DSP Dataset Tissue | Tissue type(s) associated with the dataset. Multiple values permitted, comma separated. | False | list like | Brain |
| DSP Dataset File Formats | A list of file formats associated with the dataset. Multiple values permitted, comma separated. | False | list like | CSV |
| DataDSP_id | A unique primary key that enables record updates using schematic. | True | unique | CA261717-DSP-1 |
| DSP Dataset Level | The level of processing associated with the dataset. | False | list like | Level 3 |
| DSP Number of Files | The number of files that will be uploaded as part of this dataset. | False | num | 25 |
| DSP Number of Samples | The number of biospecimens associated with the files included in the dataset. | False | num | 50000 |
| DSP Number of Participants | The number of individuals or model organisms from which samples were collected to generate the dataset. | False | num | 12 |
| DSP Storage Size | The expected total storage space required for the dataset, in gigabytes (GB). | False | num | 16 |
| DSP Planned Upload Date | A non-binding, estimated date by which the files are expected to be uploaded to a repository. | True | date | 2022-12-01 |
| DSP Planned Release Date | The projected date that marks the end of any requested or required sharing embargo for the dataset. | False | date | 01/25/2023 |
| DSP Dataset Grant Number | Grant number(s) associated with the dataset's development. Multiple values permitted, comma separated. | True | list like | CA209971 |
| DSP Dataset Url | The URL where the dataset is or will be stored. For Synapse storage, the Synapse URL or DOI associated with the dataset storage folder should be used. | False | url | (none) |
| DSP Data Use Codes | DUO code: a data item that is used to indicate consent permissions for datasets and/or materials, and relates to the purposes for which datasets and/or material might be removed, stored or used. Available DUO code definitions can be found here: https://mc2-center.github.io/data-models/valid_values/sharingPlans/#attribute-dsp-data-use-codes | False | list like | If a research institution is using certain data for health and medicine-based studies, then their DSP Data Use Codes could be "HMB". |
| DSP IRB Form | The Synapse ID for the executed IRB protocol or exemption document that was uploaded to Synapse. Required if human-derived data was generated for this study and will be uploaded as part of this dataset. | False | regex match syn\d+ | DSP IRB Form V2.0, 2020 |
| DSP Dataset Description | A text description of the files contained in this dataset. | False | str | A quick description of your data sharing plan. |
| DSP Dataset Destination | An identifier representing the repository in which this dataset is intended to be held for long term preservation. | False | str | Synapse |
| DSP Dataset Metadata | The link(s) corresponding to metadata templates that should be used to record information about this data. This field will be populated by the MC2 Center after plan submission. | False | list like | (none) |
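
Before submitting, an entry can be pre-checked locally against the rules listed above. The sketch below is illustrative only and is not the validation applied on submission. The required attributes and the `syn\d+` pattern come from the table. The function names are hypothetical, and two behaviors are assumptions: dates are checked as `YYYY-MM-DD` (the accepted date formats are not stated here), and the Synapse ID pattern must match the whole value.

```python
import re
from datetime import datetime

# Attributes marked True in the "Required" column of the field reference.
REQUIRED = [
    "DSP Dataset Name",
    "DSP Dataset Assay",
    "DSP Dataset Species",
    "DataDSP_id",
    "DSP Planned Upload Date",
    "DSP Dataset Grant Number",
]
NUMERIC = [
    "DSP Number of Files",
    "DSP Number of Samples",
    "DSP Number of Participants",
    "DSP Storage Size",
]
DATE_FIELDS = ["DSP Planned Upload Date", "DSP Planned Release Date"]
SYNAPSE_ID = re.compile(r"syn\d+")  # rule listed for DSP IRB Form


def split_list(value):
    """Split a comma-separated 'list like' value into trimmed items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def check_entry(entry):
    """Return a list of problems found; an empty list means none were found."""
    def text(name):
        return str(entry.get(name, "")).strip()

    problems = []
    for name in REQUIRED:
        if not text(name):
            problems.append(f"missing required attribute: {name}")
    for name in NUMERIC:
        if text(name):
            try:
                float(text(name))
            except ValueError:
                problems.append(f"{name} is not a number: {text(name)!r}")
    for name in DATE_FIELDS:
        if text(name):
            try:
                # Assumption: ISO dates only; adjust if other formats are accepted.
                datetime.strptime(text(name), "%Y-%m-%d")
            except ValueError:
                problems.append(f"{name} is not YYYY-MM-DD: {text(name)!r}")
    irb = text("DSP IRB Form")
    if irb and not SYNAPSE_ID.fullmatch(irb):  # assumption: whole value must match
        problems.append(f"DSP IRB Form is not a Synapse ID: {irb!r}")
    return problems


entry = {
    "DSP Dataset Name": "DSP_Dataset_Lung_Research_2021",
    "DSP Dataset Assay": "3D Bioprinting",
    "DSP Dataset Species": "Asian Elephant",
    "DSP Planned Upload Date": "2022-12-01",
    "DSP Dataset Grant Number": "CA209971",
    "DSP Storage Size": "16 GB",  # a unit suffix is not numeric
}
for problem in check_entry(entry):
    print(problem)
```

For this entry the check reports the missing `DataDSP_id` and the non-numeric storage size. Entering the size as a bare number of gigabytes, for example `16`, satisfies the `num` rule.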