Download HOMD Data
Taxonomic Data::Batch Downloads
Abundance Data::Batch Downloads [page]
Genomic Meta Information::Batch Downloads
Type | Formats | ||
---|---|---|---|
Sequence Meta Information [page] | Tab Delimited Text (View in browser) | Tab Delimited Text (Save to file) | MS Excel Format (Save to file) |
NCBI Genome Annotations | [FTP Site for download] |
PROKKA Genome Annotations | [FTP Site for download] |
Genomic Trees: Conserved Protein, Ribosomal Protein and 16S rRNA | [FTP Site for download] |
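The tab-delimited export listed above can be read with standard tooling. The following is a minimal Python sketch, assuming the file begins with a header row. The column names and the sample row are hypothetical placeholders, not the actual HOMD layout.

```python
import csv
import io


def read_tab_delimited(text):
    """Parse tab-delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


# Hypothetical sample; the real file's columns and values may differ.
sample = "genome_id\torganism\tcontigs\nG0001\tExample species\t12\n"
rows = read_tab_delimited(sample)
for row in rows:
    print(row["genome_id"], row["organism"], row["contigs"])
```

For a file saved to disk, the same function can be given the result of reading the file as text, for example `read_tab_delimited(open(path, encoding="utf-8").read())`, where `path` is wherever the download was saved.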
Phylogenetics
Type | Formats | ||
---|---|---|---|
Phylogenetic tree of HOMD 16S rRNA Reference Sequences | svg | newick | Sequence ID table |
Phylogenetic tree of 16S rRNA gene sequences identified in HOMD genomes | svg | newick | Sequence ID table |
Phylogenomic tree based on conserved proteins identified from HOMD genomes | svg | newick | Sequence ID table |
Phylogenetic tree of ribosomal proteins identified from HOMD genomes | svg | newick | Sequence ID table |
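To match the leaves of a downloaded newick tree against the accompanying Sequence ID table, the leaf labels first need to be extracted. The sketch below does this with a regular expression, assuming unquoted leaf labels that contain no parentheses, commas, colons, or semicolons. The tree string is a made-up example, not HOMD data. A dedicated tree library would be more robust for quoted labels or comments.

```python
import re

# A leaf label directly follows "(" or "," in Newick; internal node labels
# follow ")" and are therefore skipped by this pattern.
_LEAF = re.compile(r"[(,]\s*([^(),:;\s][^(),:;]*)")


def leaf_names(newick):
    """Return leaf labels of a Newick string in left-to-right order."""
    return [label.strip() for label in _LEAF.findall(newick)]


# Hypothetical tree with branch lengths and one internal node label.
example_tree = "((seqA:0.1,seqB:0.2)node1:0.3,seqC:0.4);"
print(leaf_names(example_tree))
```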
Download HOMD 16S rRNA Gene Sequences
HOMD provides two sets of 16S rRNA gene reference sequences (RefSeq) for download and BLAST search:
1. HOMD 16S rRNA RefSeq: This set contains sequences representing all currently named and unnamed oral taxa.
2. HOMD 16S rRNA Extended RefSeq: This set contains additional 16S rRNA gene reference sequences that are distinctly different from existing taxa but have not yet been assigned a taxon ID.
A phylogeny-based, high-resolution, habitat-specific training reference sequence set was constructed to achieve species/supraspecies-level taxonomic assignment of short- and long-read 16S rRNA gene-derived amplicon sequence variants (ASVs). The training set sequences, together with the pipeline scripts, are available for download. For a detailed description of the training set and the pipelines, please refer to our publication (Escapa et al., 2020).
In addition, the entire collection of cloned sequences (clonal collection) from which the RefSeq was derived is available for download on this page and on the HOMD FTP site.
[View Version History]
* These sequences are corrected consensus sequences. Many have been corrected and extended based on alignment with other sequences for that taxon, with Ns and indels removed. Therefore, for many sequences, the Reference Sequence will differ from the GenBank sequence listed in the header information.
We have not yet updated our own GenBank sequences and cannot update those from other depositors. We believe these are currently the best reference sequences available and, for the purposes of BLAST analysis, they have the advantage of being of uniform length.
HOMD Database Schema (updated 2022-06-01)
Open Schema Diagram (pdf) |
Download Schema (mysqldump --no-data) |
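A dump produced with `mysqldump --no-data` contains only table definitions, each introduced by a `CREATE TABLE` statement with the table name in backticks. As a quick way to get an overview of the downloaded schema, the sketch below lists the table names. The table names in the sample text are hypothetical and do not reflect the actual HOMD schema.

```python
import re

# mysqldump writes table definitions as: CREATE TABLE `name` ( ... );
_CREATE = re.compile(r"CREATE TABLE (?:IF NOT EXISTS )?`([^`]+)`", re.IGNORECASE)


def table_names(schema_sql):
    """Return table names in the order they are defined in the dump."""
    return _CREATE.findall(schema_sql)


# Hypothetical excerpt in the style of a mysqldump --no-data file.
schema_sql = """
DROP TABLE IF EXISTS `example_taxon`;
CREATE TABLE `example_taxon` (
  `id` int NOT NULL,
  PRIMARY KEY (`id`)
);
CREATE TABLE `example_genome` (
  `id` int NOT NULL
);
"""
print(table_names(schema_sql))
```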