A selection of online data repositories, grouped by subject area.
Biochemistry of the Human Body repositories
- Electron Microscopy Data Bank (EMDB). A repository for electron microscopy density maps of macromolecular complexes and subcellular structures.
- Human Metabolome Database. Contains detailed information about small molecule metabolites found in the human body.
- MetaboLights. A database for metabolomics experiments and derived information.
Biomedicine and Health repositories
- Adverse Event Reporting System. A database, compiled by the U.S. Food and Drug Administration (FDA), containing information on adverse event and medication error reports submitted to the FDA.
- Broad Institute Cancer Datasets. Cancer program datasets from the Broad Institute, MIT.
- CDC FluView. The weekly U.S. influenza surveillance report.
- CDC National Center for Health Statistics. This page compiles key sources of research data and related information from the NCHS website.
- ClinicalStudy DataRequest. This site provides access to data from listed clinical trials. Researchers can use it to request access to anonymized patient-level data and/or supporting documents to conduct further research.
- Closer Discovery. An online resource that enables researchers to view and appraise data from eight leading UK longitudinal studies.
- C-PATH Online Data Repository (CODR). Allows researchers to upload and work on data relevant to biomarkers of drug toxicity, neurodegenerative diseases, and patient-reported outcomes.
- freeBIRD. A website set up to allow the uploading and sharing of injury and emergency research data.
- HCUP Databases. A collection of longitudinal hospital care data from the Healthcare Cost and Utilization Project (HCUP), USA.
- HealthData.gov. A site that provides access to health data from the U.S. government.
- Influenza Research Database. Contains avian and non-human mammalian influenza surveillance data, human clinical data associated with virus extracts, phenotypic characteristics of viruses isolated from extracts, and all genomic and proteomic data available in public repositories for influenza viruses.
- National Addiction & HIV Data Archive Program. A US repository of data relevant to drug addiction and HIV research.
- National Cancer Institute Surveillance Research Program. The Program provides surveillance and research data, statistical reports, and analytical tools on cancer.
- NDAR - National Database for Autism Research. An NIH-funded research data repository that aims to accelerate progress in autism spectrum disorder (ASD) research.
- NDCT - National Database for Clinical Trials Related to Mental Illness. A platform for the sharing of human subjects data from all clinical trials funded by the National Institute of Mental Health (NIMH), USA.
- NIH Data Sharing Repositories. A list of NIH-supported data repositories that make data accessible for reuse.
- OpenfMRI. Based at Stanford University, this repository allows for deposit and sharing of complete raw fMRI datasets.
- PhysioNet. Free access to collections of recorded physiologic signals and related open-source software.
Genetics repositories
- ArrayExpress Archive. Stores data from high-throughput functional genomics experiments.
- COSMIC (Catalogue of Somatic Mutations in Cancer). Stores and displays somatic mutation information and related details and contains information relating to human cancers.
- Database of Genomic Variants (DGV). A catalogue of structural variation (SV) found in the genomes of control individuals from worldwide populations.
- dbSNP: the NCBI Database of Genetic Variation. An archive for genetic variation within and across species.
- dbVar. NCBI's database of genomic structural variation.
- Database of Genotypes and Phenotypes (dbGaP). Developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype.
- EBI Metagenomics. Enables analysis and archiving of metagenomic data.
- European Genome-Phenome Archive (EGA). A repository for all types of sequence and genotype experiments.
- European Variation Archive (EVA). A database of all types of genetic variation data from all species.
- Gene Expression Omnibus. A genomics data repository supporting MIAME-compliant data submissions.
- GenomeRNAi. A database containing phenotypes from RNA interference (RNAi) screens in Drosophila and Homo sapiens.
- GPMDB - Global Proteome Machine Database. A database of proteomics experimental information and data.
- miRBase. A searchable database of published miRNA sequences and annotation.
- NCBI - SRA: Sequence Read Archive. A database that stores sequence data obtained from next-generation sequencing technology.
- PRIDE Archive - Proteomics Data Repository. A repository of mass spectrometry-derived proteomics data.
Global Health repositories
- CIRI Human Rights Data Project. Contains standards-based quantitative information on government respect for 15 internationally recognized human rights for 202 countries, annually from 1981-2011.
- The Demographic and Health Surveys (DHS) Program. A collection of data on population, health, HIV and nutrition through more than 300 surveys in over 90 countries.
- EM-DAT: The International Disaster Database. Standardised data sets compiled by the Centre for Research on the Epidemiology of Disasters (CRED).
- The IATI (International Aid Transparency Initiative) Register. This registry provides links to all raw data published using the IATI XML standard.
- INDEPTH (International Network for the Demographic Evaluation of Populations and Their Health) Data Repository. INDEPTH is a global network of research centres that conduct longitudinal health and demographic evaluation of populations in low- and middle-income countries (LMICs). The repository aims to make data from these evaluations available to data users.
- OECD (Organisation for Economic Co-operation and Development) Data. Data relating to a variety of subject areas, including health.
- UNdata. The United Nations Statistics Division (UNSD) data service brings UN statistical databases within easy reach of users through a single entry point.
- UNICEF Data: Monitoring the Situation of Children and Women. UNICEF maintains several databases for tracking the situation of children and women globally. The databases include only statistically sound, nationally representative data from household surveys and other sources. They are updated annually.
- UNICEF: MICS (Multiple Indicator Cluster Surveys). Datasets from these household surveys can be accessed.
- World Bank Data. Statistics on development in countries around the globe, collated by the World Bank. Consists of a number of datasets including development indicators, debt statistics and trade logistics statistics.
- World Health Organization Health Data and Statistics. This site provides access to the Global Health Observatory (GHO), Global Health Estimates (GHE), and the WHO Mortality Database.
Medical Image repositories
- Cancer Imaging Archive. Contains medical images of cancer available for public download.
- SICAS (Swiss Institute for Computer Assisted Surgery) Medical Image Repository - a Virtual Skeleton Database. A collection of medical images.
- Wellcome Images. A collection of images with themes ranging from medical and social history to contemporary healthcare and biomedical science.
Neurosciences repositories
- neurosynth.org. A platform for large-scale, automated synthesis of functional magnetic resonance imaging (fMRI) data.
- brainmap.org. A database of published functional and structural neuroimaging experiments with coordinate-based results (x,y,z) in Talairach or MNI space.
- PRO-ACT (Pooled Resource Open-Access ALS Clinical Trials Database). A large ALS (amyotrophic lateral sclerosis) clinical trials dataset.
Proteins, nucleoproteins, nucleotides, nucleic acids and peptides repositories
- Biological General Repository for Interaction Datasets (BioGRID). An archive of genetic and protein interaction data from model organisms and humans.
- Biological Magnetic Resonance Data Bank. A repository for data from NMR spectroscopy on proteins, peptides, nucleic acids and other biomolecules.
- DNA Data Bank of Japan (DDBJ). Provides nucleotide data and a supercomputer system to support researchers in life science.
- Database of Interacting Proteins (DIP). This database archives and evaluates experimentally determined interactions between proteins.
- European Nucleotide Archive (ENA). A comprehensive record of the world's nucleotide sequencing information, including raw sequencing data.
- GenBank. The NIH genetic sequence database.
- IntAct Molecular Interactions Database. A database system and analysis tool for molecular interaction data.
- Nucleic Acid Database (NDB). Provides access to information about 3D nucleic acid structures and their complexes.
- Peptide Atlas. A compendium of peptides identified in a large set of tandem mass spectrometry experiments.
- Protein Data Bank (PDB). Archives information about the 3D shapes of proteins, nucleic acids, and complex assemblies.
- UniProtKB (UniProt Knowledgebase). Contains functional information on proteins.