Frequently Asked Questions

What are the funding sources for the Epi25 genomic data generation? 

From 2016 to 2020, genomic data generation was funded by a grant at the Broad Institute from the National Human Genome Research Institute (NHGRI) called the Centers for Common Disease Genomics (UM1HG008895). Using these funds, we generated exomes for almost 30,000 patients with epilepsy!
From 2021 to 2023, data generation has been funded by a grant from the National Institute of Neurological Disorders and Stroke (NINDS), R01NS106104. Plans for 2024 and beyond are under discussion.

Is there funding support for preparing and shipping samples and entering phenotype data into the REDCap database? 

At this time, Epi25 is not able to offer financial support to contributing sites for sample preparation, shipping, or phenotyping.

Do you have an Epi25 consent form I can use to consent participants?

To date, Epi25 has aggregated DNA samples collected under previously approved IRB protocols at the sites’ home institutions. Therefore, we do not have a specific Epi25 consent form to share. If you are unsure whether your participants have provided sufficient informed consent to be included in Epi25 sequencing and data sharing, please reach out to Felecia Cerrato.

What is the NIH Genomic Data Sharing Policy and why does it matter in Epi25?

Genomic data generation for Epi25 has been funded by the US government through the National Human Genome Research Institute and the National Institute of Neurological Disorders and Stroke, both of which are institutes of the National Institutes of Health (NIH). The NIH Genomic Data Sharing (GDS) Policy applies to all NIH-funded research (e.g., grants, contracts, and intramural research) that generates large-scale human or non-human genomic data, regardless of the funding level, as well as to the use of these data for subsequent research. Large-scale data include genome-wide association studies (GWAS), single nucleotide polymorphism (SNP) arrays, and genome sequence, transcriptomic, epigenomic, and gene expression data. Because Epi25 genomic data generation has been funded by the NIH, the NIH GDS Policy applies to Epi25 genomic data, and the data are deposited into an NIH-sponsored controlled-access repository such as dbGaP or AnVIL.

In brief, the NIH Genomic Data Sharing Policy requires that participants consented after the policy's effective date of January 25, 2015, have provided informed consent for broad sharing of samples/data, future use of samples/data, and deposition of data into a repository. This means that the consent forms used to recruit participants should have explicit language regarding these points.

If your site has samples collected after January 25, 2015, and you would like to contribute them to Epi25, the consent forms used to collect these samples must contain this language.

If your site has samples collected before January 25, 2015, the consent form must not be inconsistent with broad sharing, future use, and deposition of data into a repository. 
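
The two cases above can be sketched as a small date check. This is only an illustration, not an official Epi25 tool: the function name consent_requirement is hypothetical, and treating a consent dated exactly January 25, 2015 under the stricter rule is an assumption, since the cases above only address dates before and after the effective date.

```python
from datetime import date

# Effective date of the NIH Genomic Data Sharing Policy, as stated in this FAQ.
GDS_EFFECTIVE_DATE = date(2015, 1, 25)


def consent_requirement(consent_date: date) -> str:
    """Return the consent-form requirement for a given consent date.

    Hypothetical helper for illustration only.
    """
    if consent_date < GDS_EFFECTIVE_DATE:
        # Older consents only need to be compatible with sharing.
        return ("consent must not be inconsistent with broad sharing, "
                "future use, and deposition of data into a repository")
    # Assumption: a consent dated exactly on the effective date is handled
    # under the stricter rule, because the FAQ does not address that day.
    return ("consent must explicitly cover broad sharing, "
            "future use, and deposition of data into a repository")


print(consent_requirement(date(2014, 6, 1)))
print(consent_requirement(date(2019, 3, 15)))
```

A check like this only sorts samples by date. Whether a particular consent form actually meets the requirement still has to be judged by reading the form itself.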

An FAQ about the NIH Genomic Data Sharing Policy can be found here.

Examples and considerations for consent form language that meets these requirements can be found here.

What is a Data Use Limitations letter? If I can provide an IRB approval letter that dbGaP data deposition is allowed, why do you require a Data Use Limitations letter?

Because the Broad is using Epi25 samples for secondary use and did not approve the sample collection activities themselves, we require Data Use Limitations letters to indicate (A) that sharing samples and data with the Broad is allowed, (B) that sharing data in a controlled-access repository is allowed, and (C) whether any use restrictions are associated with the data. The Broad uses the Data Use Limitations letters to complete the Institutional Certification form that is required to register samples/data in an NIH-designated controlled-access repository such as dbGaP or AnVIL.

What if my ethics committee will not sign the Data Use Limitations letters?

Contact the Broad Project Manager to discuss options for signatories for the letter.

Will the Broad accept more or less DNA than the recommended 50 ul at 80 ng/ul?

The requested amount of DNA is 50 ng/ul at 50 ul. This amount of DNA ensures that there will be enough material for any downstream analyses. If you have samples that cannot meet these requirements, we may still be able to accept them. Contact Felecia Cerrato with any questions about the sample input requirements.

What about family samples? Can those be sequenced?

At this time, Epi25 is sequencing one affected case per family because we are pursuing case-control analyses. Additional family members, whether affected or unaffected, are not eligible for Epi25.

Is there a process to apply for and use the Epi25 data from all sites? 

Members of Epi25 sites can apply for data access through the collaborative’s proposal form. The data access process for Epi25 is described on our “Data Application” page. Please also be sure to review the Genomic Data Sharing SOP.  

We have set up Google buckets to share data with analysts on teams whose research proposals have been reviewed and approved by the Epi25 Strategy Committee and membership. Because these data are often released to analysts as early access, and additional IRB or legal approvals would be needed at the team’s site to download the data, individual-level genomic data are not to be downloaded from the Google buckets. Analysis should be done in the cloud environment.

Please note that the data available in the Google buckets are from Epi25 sites ONLY. Any non-Epi25 control cohorts (e.g. those used in the primary publications) are not available via the Epi25 data application.

Summary results from our primary WES analyses can be found on the Epi25 Results Browser.

What genomic data will I receive from the samples that my site contributes?

For samples that are whole exome sequenced, you will receive a variant call format (VCF) file and CRAM files for the samples that you contributed. The CRAM files are very large; please be prepared to transfer and store them at your institution. As of Year 4 of the project (2020), these data (VCFs and CRAMs) are aligned to hg38 and stored in AnVIL Workspaces on Terra. You will need a Terra account associated with a Google ID in order to access your data. Please review this link for instructions on how to create an account and this link for options to download the data. Please note that there are egress charges associated with downloading data from AnVIL Workspaces, so be sure to arrange for a permanent local storage solution if you intend to download your data.

We are also genotyping Epi25 samples on a whole genome genotyping array. For Years 1-5, we used the Illumina Infinium Global Screening Array with multi-disease content, version 1.0 (GSA-MD v1.0). Starting in Year 6, we are using v3 of this array. These data (VCF, IDAT, and GTC files) can be made available to you via a Terra Workspace.

Please note that in the majority of cases, samples that are selected by Epi25 for genomic data generation are submitted for both whole genome genotyping and whole exome sequencing. However, if a sample has an insufficient amount of DNA, we may not proceed with genotyping or sequencing it, or we may generate only one data type.

How can I find out more about the methods used to generate the genomic data? 

This depends on the type of data (WES or GSA) and when it was generated. In general: 
Years 1-3 WES: Illumina TruSeq or Nextera bait, Illumina HiSeqX. Information about these kits is available on Illumina’s website.

Years 4 onward WES: Twist Biosciences bait, Illumina NovaSeq. For reference files, please reach out to the Broad Project Manager.

For information on how array or WES VCF files were generated, please reach out to the Broad Project Manager.
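
As a rough illustration, the year-based breakdown above, together with the array versions described earlier in this FAQ, can be written as a small lookup. This is a sketch only: the function name platform_for and the "WES"/"GSA" labels are hypothetical, "GSA-MD v3" is shorthand for v3 of the array, and the project year is assumed to be a whole number starting at 1.

```python
def platform_for(data_type: str, project_year: int) -> str:
    """Return the platform used for a data type in a given Epi25 project year.

    Hypothetical helper; the values are copied from this FAQ.
    """
    if project_year < 1:
        raise ValueError("project_year must be 1 or greater")

    if data_type == "WES":
        # Years 1-3 used Illumina baits on the HiSeqX; Year 4 onward uses Twist.
        if project_year <= 3:
            return "Illumina TruSeq or Nextera bait, Illumina HiSeqX"
        return "Twist Biosciences bait, Illumina NovaSeq"

    if data_type == "GSA":
        # Years 1-5 used version 1.0 of the array; Year 6 onward uses v3.
        if project_year <= 5:
            return "Illumina Infinium Global Screening Array, GSA-MD v1.0"
        return "Illumina Infinium Global Screening Array, GSA-MD v3"

    raise ValueError("data_type must be 'WES' or 'GSA'")


print(platform_for("WES", 2))
print(platform_for("GSA", 6))
```

A lookup like this can help when annotating a mixed batch of files by project year, but for exact kit versions and reference files the Broad Project Manager remains the source of record.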

When will I receive the WES and genotyping data for the samples I contribute?

Data are generated on a rolling basis but are released at the close of the funding cycle.  Typically, that means that genomic data for a funding year are ready to be returned to sites at the start of the next calendar year.  

Please note that it is a requirement of participation that your site contribute detailed phenotype data (using the Epi25 clinical data forms) before receiving genomic data. Genomic data can be released to your team once the Epi25 Phenotyping team have received and reviewed the associated phenotype data.

Is Epi25 providing a summary of the results for the patients in my cohort? Can I return the genetic results back to participants?

No, Epi25 is not providing a high-level summary of the results for each participant. The genomic data generated as part of Epi25 are not meant to be used in a clinical setting. The lab protocols used to generate these data are for research purposes only; they are not CLIA-certified. Therefore, it is not recommended that these results be returned to participants.

I am an investigator from the EU. How can I send DNA samples and phenotype data to the US for Epi25 while complying with GDPR?

GDPR stands for the General Data Protection Regulation, which went into effect on May 25, 2018. GDPR is complex, and each EU member state is still developing policies and procedures to comply with the new law. However, there are now stricter rules governing the transfer, use, and storage of “personal data”, which includes DNA and health information. This means that you may need to enter into a specific agreement with the Broad and/or the University of Luxembourg to transfer your DNA samples (to the Broad) and phenotype data (to the University of Luxembourg). The contacts at each site will work with you to determine the type of agreement needed.

What are the Centers for Common Disease Genomics and how are they related to Epi25?

The Centers for Common Disease Genomics (CCDG) program is part of the NHGRI Genomic Sequencing Program. A CCDG grant at the Broad funded genomic data generation for Epi25 from early 2016 through 2020. You can learn more about the CCDG and other funded sites here and here.

Epi25 genomic data are also made available to CCDG members very soon after they are generated. CCDG members have to apply to use the data, and after approval, the data will be shared in line with the data use restrictions indicated in the Data Use Limitations letters.

What is AnVIL? 

AnVIL is a newly developed data storage and analysis platform funded by the NHGRI. AnVIL is powered by Terra, which means that AnVIL looks a lot like Terra. Epi25 genomic data funded by the NHGRI CCDG will be staged in AnVIL for access by sample providers as well as investigators who apply for data access via dbGaP. More information can be found here: https://anvilproject.org/

I would like to publish a manuscript using genomic data from the samples I contributed and/or data from other Epi25 sites. Where can I find more information about the publication guidelines for Epi25? 

Please review the guidelines. If you have questions, please reach out to the Publications Committee via Olivia Hoeper.

Can you help me? I am not sure who to contact. 

Patient eligibility and phenotype data: Olivia Hoeper

Genomic data, consent and data sharing permissions, and DNA sample submission: Felecia Cerrato

Phenotype data transfer, agreements and REDCap access: Roland Krause