An international consortium of scientists is proposing what is arguably the most ambitious project in the history of biology: sequencing the DNA of all known eukaryotic species on Earth.
The monumental initiative promises to transform the scientific understanding of life on Earth and to provide a vital new resource for global innovations in medicine, agriculture, conservation, technology and genomics.
The central goal of the Earth BioGenome Project is to understand the evolution and organization of life on our planet by sequencing and functionally annotating the genomes of 1.5 million known species of eukaryotes, a massive group that includes plants, animals, fungi and other organisms whose cells have a nucleus that houses their chromosomal DNA. To date, the genomes of less than 0.2 percent of eukaryotic species have been sequenced.
The project also seeks to reveal some of the estimated 10 million to 15 million unknown species of eukaryotes, most of which are single-celled organisms, insects and small animals in the oceans. The genomic data will be a freely available resource for scientific discovery, and the resulting benefits will be shared with countries and indigenous communities where biodiversity is sourced. Researchers estimate the proposed initiative will take 10 years and cost approximately $4.7 billion.
In a perspective paper published today (April 23) in the Proceedings of the National Academy of Sciences (PNAS), the 24 interdisciplinary experts who make up the Earth BioGenome Project Working Group provide a compelling rationale for why the project should go forward and outline a roadmap for how it can be achieved.
Harris A. Lewin, a distinguished professor of evolution and ecology at UC Davis, chairs the working group and is the lead author of the paper. Gene Robinson, director of the Carl R. Woese Institute for Genomic Biology at the University of Illinois, and W. John Kress, research botanist and curator at the Smithsonian Institution, are co-chairs.
In the paper, the scientists point to the hugely successful precedent of the Human Genome Project. That project was launched in 1990 and completed in 2003, with the United States and funding agencies in other countries investing approximately $3 billion to sequence the entire human genome. The resulting “genomic revolution” has had an enormous impact not only on human medicine but also on veterinary medicine, agricultural bioscience, biotechnology, environmental science, renewable energy, forensics and industrial biotechnology. A 2013 report by the Battelle Memorial Institute estimated the financial benefit of the Human Genome Project to the U.S. economy to be nearly $1 trillion.
Lewin sees the Earth BioGenome Project as providing even greater opportunities for generating scientific and societal benefits.
“The EBP will lay the scientific foundation for a new bio-economy that has the potential to bring innovative solutions to health, environmental, economic and social problems to people across the globe, especially in under-developed countries that have significant biodiversity assets,” he said.
The idea for the project emerged in 2015 at a meeting organized by Lewin, Robinson and Kress, which was followed by another meeting organized as part of the Smithsonian Initiative on Biodiversity Genomics. After the completion of the Human Genome Project, many organisms of biomedical, agricultural and industrial importance had their genomes sequenced. The attendees at the 2015 meeting decided that an even more ambitious project was needed to advance biology, one that would sequence DNA from all complex life on Earth.
Advances in technology have made the project feasible. The cost of whole-genome sequencing has declined to about $1,000 for a draft-quality sequence of human genome size and about $30,000 for a reference-quality assembly of the chromosomes of an average eukaryotic genome.
With advances in high-performance computing, data storage and bioinformatics, high-throughput assembly and characterization of genomes is now feasible, although innovations in algorithms for aligning, interpreting and visualizing the massive amounts of data will be necessary. The completed project is expected to require about 1 exabyte (1 billion gigabytes) of digital storage capacity.
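To give a sense of scale, the headline figures quoted in this article (1.5 million known species, less than 0.2 percent sequenced to date, 10 years, $4.7 billion and 1 exabyte of storage) can be combined in a rough back-of-envelope calculation. The Python sketch below is illustrative only: the function and variable names are hypothetical, and the per-species averages assume the budget and storage are spread evenly across all species, which the working group does not claim.

```python
# Back-of-envelope arithmetic using only figures quoted in the article.
# Assumption (not from the paper): cost and storage are spread evenly
# across species, so the results are coarse averages, not a budget plan.

KNOWN_SPECIES = 1_500_000      # known eukaryotic species to be sequenced
SEQUENCED_FRACTION = 0.002     # "less than 0.2 percent" sequenced so far (upper bound)
TOTAL_COST_USD = 4.7e9         # estimated total cost of the project
PROJECT_YEARS = 10             # estimated duration
STORAGE_GB = 1e9               # 1 exabyte expressed as 1 billion gigabytes


def summarize():
    """Return rough averages derived from the quoted headline figures."""
    return {
        "max_species_already_sequenced": KNOWN_SPECIES * SEQUENCED_FRACTION,
        "avg_cost_per_species_usd": TOTAL_COST_USD / KNOWN_SPECIES,
        "avg_species_per_year": KNOWN_SPECIES / PROJECT_YEARS,
        "avg_storage_per_species_gb": STORAGE_GB / KNOWN_SPECIES,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.0f}")
```

Under these assumptions, fewer than about 3,000 species have been sequenced so far, and the project would need to average roughly 150,000 species a year at about $3,100 and about 670 gigabytes per species.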
Addressing Critical Needs
The project also addresses several critical needs. One is the need for better conservation tools for endangered species and ecosystems, particularly those impacted by climate change.
“The Earth BioGenome Project will give us insight into the history and diversity of life and help us better understand how to conserve it,” Robinson said.
The working group also sees the project as essential for developing new drugs for infectious and inherited diseases as well as creating new biological synthetic fuels, biomaterials, and food sources for the anticipated human population of 9.6 billion by 2050.
“Scientists believe that by the end of the century more than half of all species will vanish from the face of the Earth, and with consequences to human life that are unknown, but are potentially catastrophic,” Lewin said.
To help achieve its vision, the Earth BioGenome Project is developing an array of global partnerships and strategies.
The organizational structure of the project will consist of a “global network of communities,” with each community contributing to the project and following the project’s protocols and standards. The project has partnered with the Global Genome Biodiversity Network, the world’s major resource of tissues and DNA from voucher specimens. It is also forging partnerships with communities of scientists working on different groups of organisms, including the Vertebrate Genomes Project, the Global Invertebrate Genome Alliance, the 10,000 Plant Genomes Project, the 5000 Insect Genomes Project, and others.
Gathering specimens of all these species will be a massive undertaking, which is why partnerships with institutions that procure and preserve the Earth’s biodiversity, such as natural history museums, botanical gardens, zoos and aquaria, will be crucial for success. The Smithsonian herbarium, for example, contains around 300,000 species.
“Many scientists at the Smithsonian Institution with its 19 museums and nine research institutes are applying genomics technologies in their research to increase our understanding of the natural world. The strength of biodiversity genomics at the Smithsonian is a good indicator of the vital role the institution will play in furthering the goals of the Earth BioGenome Project,” Kress said.
The Earth BioGenome Project also plans to capitalize on the “citizen scientist” movement to collect specimens, an approach modeled after the University of California Conservation Genomic Consortium’s CALeDNA program. The project is likely to enable the development of new technologies, such as portable genetic sequencers and instrumented drones that can go out, identify samples in the field, and bring those samples back to the laboratory.
The organizers also want to ensure that any benefits arising from the project are equitably shared with stakeholders, such as the countries that harbor the genetic resources the project will need. A partnership between the Earth BioGenome Project and the Earth Bank of Codes was announced at the annual meeting of the World Economic Forum in Davos, Switzerland, in January 2018. The Earth Bank of Codes aims to make nature’s biological and biomimetic assets accessible to scientists and innovators around the world, while tackling bio-piracy and ensuring fair and equitable sharing of the commercial benefits that may ensue, in alignment with the Convention on Biological Diversity’s Nagoya Protocol.
A pilot program has been initiated in conjunction with the Amazon Bank of Codes and the World Economic Forum. Brazil contains approximately 10 percent of the world’s total biodiversity. The project will offer indigenous and traditional communities in the Amazon Basin an opportunity to reap a fair share of the economic value generated from the use of biological data and natural assets from their local biomes. If successful, the pilot program will serve as a model for similar programs in other countries with rich biodiversity.
Earth BioGenome Working Group
Harris A. Lewin, University of California, Davis; Gene E. Robinson, University of Illinois at Urbana–Champaign; W. John Kress, Smithsonian Institution; William J. Baker, Royal Botanic Gardens, Kew; Jonathan Coddington, Smithsonian Institution; Keith A. Crandall, George Washington University; Richard Durbin, University of Cambridge and the Wellcome Sanger Institute; Scott V. Edwards, Harvard University; Félix Forest, Royal Botanic Gardens, Kew; M. Thomas P. Gilbert, University of Copenhagen, Norwegian University of Science and Technology; Melissa M. Goldstein, Milken Institute School of Public Health; Igor V. Grigoriev, DOE Joint Genome Institute and University of California, Berkeley; Kevin J. Hackett, U.S. Department of Agriculture; David Haussler, University of California, Santa Cruz; Erich D. Jarvis, The Rockefeller University; Warren E. Johnson, Smithsonian Institution; Aristides Patrinos, Novim, Santa Barbara, California; Stephen Richards, Baylor College of Medicine; Juan Carlos Castilla-Rubio, Space Time Ventures and World Economic Forum’s Global Future Council on Environment and Natural Resource Security; Marie-Anne van Sluys, Universidade de São Paulo and São Paulo Research Foundation (FAPESP); Pamela S. Soltis, University of Florida; Xun Xu, China National Genebank; Huanming Yang, BGI-Shenzhen; Guojie Zhang, University of Copenhagen, BGI.