Data Deposition

Nucleic acid and protein sequences, microarray data, and structural data must be deposited in a public database and must be available as of the date of publication. Microarray data should be MIAME compliant. For molecular models, the peptide backbone coordinates plus the coordinates of important side chains emphasized in the text should be made available in the Protein Data Bank (PDB). Relevant accession numbers must be included in the manuscript text. When possible, authors should also deposit plasmid constructs, genetically modified organisms, and similar materials in an appropriate repository.

Genome-wide gene expression data

Microarray data and gene expression data should be MIAME compliant and deposited into the ArrayExpress, GEO, or DDBJ databases. Please include accession numbers in the manuscript.

Genetic data

For genetically modified model system lines, we encourage deposition of genetic data in databases such as FlyBase, WormBase, and the Saccharomyces Genome Database, as appropriate. In such cases, please include database IDs in the manuscript.

Nucleotide sequence data should be deposited in an International Nucleotide Sequence Database Collaboration member such as GenBank, the European Nucleotide Archive (ENA), or the DNA Data Bank of Japan (DDBJ). It is also possible to deposit sequence read data from high-throughput sequencing studies in repositories such as the Sequence Read Archive (SRA) from NCBI or ENA’s sequence read archive.

We encourage authors to follow the guidelines described in the Human Variome Project for interpretation and sharing of human genetic variation data. Subject to compliance with IRB protocols, we recommend deposition of human genetic variants and polymorphisms found in control samples in dbSNP, the Database of Genomic Variants Archive (DGVa), or the Database of Genomic Structural Variation (dbVar). We encourage our authors to deposit human genetic data found in non-healthy samples in gene- or disease-specific databases and in locus-specific mutation databases. If compliant with ethical obligations to the patients and relevant medical and legal issues, and if compatible with the IRB protocols and consent protocols in place, genotype and clinical data should be deposited in one of the major public access-controlled repositories such as dbGaP or EGA.

Proteomics, structural, and metabolomics data

For molecular models, the peptide backbone coordinates plus the coordinates of important side chains emphasized in the text should be made available in a worldwide Protein Data Bank (PDB) member. We encourage authors to deposit structural data of biological macromolecules obtained by 3D electron microscopy in the EMDataBank or the Protein Data Bank in Europe.

Protein sequence and functional information can be deposited in UniProt.

Nucleic acid structural information should be submitted to the Nucleic Acid Database (NDB). For NMR structures, data deposited should include resonance assignments, all restraints used in structure determination (NOEs, spin-spin coupling constants, amide exchange rates, etc.), and the derived atomic coordinates for both an individual structure and a family of acceptable structures.

Mass spectrometry data should be deposited in a machine-readable format such as mzML in a public database such as PRIDE or PeptideAtlas. We recommend that authors follow the MIAPE recommendations.

Protein interaction data should be deposited with a member of the International Molecular Exchange Consortium (IMEx) prior to submission of the manuscript. Authors should follow the MIMIx recommendations.

Chemical compound screening, substance information, compound structures, and assay data may be deposited in NCBI’s PubChem.

Metabolomics data should be deposited following the recommendations of the Metabolomics Standards Initiative (MSI) in a recognized repository such as MetaboLights.

Computational models

Please provide computational models in a machine-readable form at the time of first submission to allow reviewers to examine the analysis and simulations performed in the study. When possible, standardized formats (SBML or CellML) should be used instead of scripts (e.g., MATLAB). Authors are encouraged to follow the MIRIAM guidelines and deposit their models in a public database such as BioModels or JWS Online.
