TY - JOUR
T1 - A high-quality genome assembly of the North American song sparrow, Melospiza melodia
AU - Louha, Swarnali
AU - Ray, David A.
AU - Winker, Kevin
AU - Glenn, Travis C.
N1 - Publisher Copyright:
© 2020 Louha et al.
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2020/4/1
Y1 - 2020/4/1
N2 - The song sparrow, Melospiza melodia, is one of the most widely distributed species of songbirds found in North America. It has been used in a wide range of behavioral and ecological studies. This species' pronounced morphological and behavioral diversity across populations makes it a favorable candidate in several areas of biomedical research. We have generated a high-quality de novo genome assembly of M. melodia using Illumina short read sequences from genomic and in vitro proximity-ligation libraries. The assembled genome is 978.3 Mb, with a physical coverage of 24.9×, N50 scaffold size of 5.6 Mb and N50 contig size of 31.7 Kb. Our genome assembly is highly complete, with 87.5% full-length genes present out of a set of 4,915 universal single-copy orthologs present in most avian genomes. We annotated our genome assembly and constructed 15,086 gene models, a majority of which have high homology to related birds, Taeniopygia guttata and Junco hyemalis. In total, 83% of the annotated genes are assigned with putative functions. Furthermore, only ∼7% of the genome is found to be repetitive; these regions and other non-coding functional regions are also identified. The high-quality M. melodia genome assembly and annotations we report will serve as a valuable resource for facilitating studies on genome structure and evolution that can contribute to biomedical research and serve as a reference in population genomic and comparative genomic studies of closely related species.
AB - The song sparrow, Melospiza melodia, is one of the most widely distributed species of songbirds found in North America. It has been used in a wide range of behavioral and ecological studies. This species' pronounced morphological and behavioral diversity across populations makes it a favorable candidate in several areas of biomedical research. We have generated a high-quality de novo genome assembly of M. melodia using Illumina short read sequences from genomic and in vitro proximity-ligation libraries. The assembled genome is 978.3 Mb, with a physical coverage of 24.9×, N50 scaffold size of 5.6 Mb and N50 contig size of 31.7 Kb. Our genome assembly is highly complete, with 87.5% full-length genes present out of a set of 4,915 universal single-copy orthologs present in most avian genomes. We annotated our genome assembly and constructed 15,086 gene models, a majority of which have high homology to related birds, Taeniopygia guttata and Junco hyemalis. In total, 83% of the annotated genes are assigned with putative functions. Furthermore, only ∼7% of the genome is found to be repetitive; these regions and other non-coding functional regions are also identified. The high-quality M. melodia genome assembly and annotations we report will serve as a valuable resource for facilitating studies on genome structure and evolution that can contribute to biomedical research and serve as a reference in population genomic and comparative genomic studies of closely related species.
KW - Assembly
KW - De novo
KW - Dovetail genomics
KW - Melospiza melodia
KW - Passeriformes
KW - Sequencing
KW - Whole genome
UR - http://www.scopus.com/inward/record.url?scp=85083536391&partnerID=8YFLogxK
U2 - 10.1534/g3.119.400929
DO - 10.1534/g3.119.400929
M3 - Article
C2 - 32075855
AN - SCOPUS:85083536391
VL - 10
SP - 1159
EP - 1166
JO - G3: Genes, Genomes, Genetics
JF - G3: Genes, Genomes, Genetics
SN - 2160-1836
IS - 4
ER -