The Vertebrate Genomes Project (VGP), a project of the G10K Consortium, aims to generate at least one high-quality, error-free, near-gapless, chromosome-level, haplotype-phased, and annotated reference genome assembly for every extant vertebrate species. These genomes will be used to address fundamental questions in biology and disease, to identify the species most genetically at risk of extinction, and to preserve the genetic information of life. The VGP will proceed in phases that follow the taxonomic hierarchy, the ranking of organisms from the broadest classification (domain) to the narrowest (species): orders (Phase 1), families (Phase 2), and genera (Phase 3), and eventually all species (Phase 4). Species are selected on a combination of criteria: those with existing draft genomes in need of improvement, those with specialized traits that inform human biology, those in immediate danger of extinction, and those in prominent use in biomedical research. Endangered species are a high priority because the planet is experiencing its sixth mass extinction event, the worst since the die-off of the dinosaurs 66 million years ago, driven in part by human-caused pollution, habitat destruction, and climate change.
The VGP intends to use the genomic data that it produces for multiple studies, such as the following:
Genome-scale family tree of vertebrates.
Comparative genomics of convergent traits (e.g. vocal learning, flight, loss of limbs, and aquatic / terrestrial adaptations).
Developing universal vertebrate gene orthology and nomenclature.
Deciphering vertebrate chromosomal genome evolution.
Reconstruction of the common ancestor genomes of all vertebrates and of key vertebrate classes (e.g. mammals, birds, reptiles, amphibians, teleosts, bony vertebrates, jawed vertebrates, and tetrapods).
Complete Vertebrate Mitogenomes Reveal Widespread Repeats and Gene Duplications. Formenti, Giulio, Arang Rhie, Jennifer Balacco, Bettina Haase, Jacquelyn Mountcastle, Olivier Fedrigo, Samara Brown, Marco Rosario Capodiferro, Farooq O. Al-Ajli, Roberto Ambrosini, Peter Houde, Sergey Koren, Karen Oliver, Michelle Smith, Jason Skelton, Emma Betteridge, Jale Dolucan, Craig Corton, Iliana Bista, James Torrance, Alan Tracey, Jonathan Wood, Marcela Uliano-Silva, Kerstin Howe, Shane McCarthy, Sylke Winkler, Woori Kwak, Jonas Korlach, Arkarachai Fungtammasan, Daniel Fordham, Vania Costa, Simon Mayes, Matteo Chiara, David S. Horner, Eugene Myers, Richard Durbin, Alessandro Achilli, Edward L. Braun, Adam M. Phillippy, and Erich D. Jarvis. 2021. Genome Biology 22(1):120.
Towards Complete and Error-Free Genome Assemblies of All Vertebrate Species. Rhie, Arang, Shane A. McCarthy, Olivier Fedrigo, Joana Damas, Giulio Formenti, Sergey Koren, Marcela Uliano-Silva, William Chow, Arkarachai Fungtammasan, Juwan Kim, Chul Lee, Byung June Ko, Mark Chaisson, Gregory L. Gedman, Lindsey J. Cantin, Francoise Thibaud-Nissen, Leanne Haggerty, Iliana Bista, Michelle Smith, Bettina Haase, Jacquelyn Mountcastle, Sylke Winkler, Sadye Paez, Jason Howard, Sonja C. Vernes, Tanya M. Lama, Frank Grutzner, Wesley C. Warren, Christopher N. Balakrishnan, Dave Burt, Julia M. George, Matthew T. Biegler, David Iorns, Andrew Digby, Daryl Eason, Bruce Robertson, Taylor Edwards, Mark Wilkinson, George Turner, Axel Meyer, Andreas F. Kautt, Paolo Franchini, H. William Detrich, Hannes Svardal, Maximilian Wagner, Gavin J. P. Naylor, Martin Pippel, Milan Malinsky, Mark Mooney, Maria Simbirsky, Brett T. Hannigan, Trevor Pesout, Marlys Houck, Ann Misuraca, Sarah B. Kingan, Richard Hall, Zev Kronenberg, Ivan Sović, Christopher Dunn, Zemin Ning, Alex Hastie, Joyce Lee, Siddarth Selvaraj, Richard E. Green, Nicholas H. Putnam, Ivo Gut, Jay Ghurye, Erik Garrison, Ying Sims, Joanna Collins, Sarah Pelan, James Torrance, Alan Tracey, Jonathan Wood, Robel E. Dagnew, Dengfeng Guan, Sarah E. London, David F. Clayton, Claudio V. Mello, Samantha R. Friedrich, Peter V. Lovell, Ekaterina Osipova, Farooq O. Al-Ajli, Simona Secomandi, Heebal Kim, Constantina Theofanopoulou, Michael Hiller, Yang Zhou, Robert S. Harris, Kateryna D. Makova, Paul Medvedev, Jinna Hoffman, Patrick Masterson, Karen Clark, Fergal Martin, Kevin Howe, Paul Flicek, Brian P. Walenz, Woori Kwak, Hiram Clawson, Mark Diekhans, Luis Nassar, Benedict Paten, Robert H. S. Kraus, Andrew J. Crawford, M. Thomas P. Gilbert, Guojie Zhang, Byrappa Venkatesh, Robert W. Murphy, Klaus-Peter Koepfli, Beth Shapiro, Warren E. Johnson, Federica Di Palma, Tomas Marques-Bonet, Emma C. Teeling, Tandy Warnow, Jennifer Marshall Graves, Oliver A. Ryder, David Haussler, Stephen J. O’Brien, Jonas Korlach, Harris A. Lewin, Kerstin Howe, Eugene W. Myers, Richard Durbin, Adam M. Phillippy, and Erich D. Jarvis. 2021. Nature 592(7856):737–46.
The VGP assembly pipeline
A new, VGP-inspired conceptual data framework
GenomeArk: the official data repository of the Vertebrate Genomes Project