Failure, Persistence, and Breakthroughs

Behind every difficult genome is a story of persistence. Some challenges are solved through technical innovation, others through collaboration, patience, and a willingness to try again when the first approach fails. In this section, EBP contributors share lessons learned from setbacks, unexpected breakthroughs, and the long path from problem to solution.

**Kerstin Howe** in her office (with her favourite painting of lichens by Samantha Clark and a Hi-C map of a tetraploid assembly on her screen).

What have you learned from working on hard genomes?

Kerstin Howe: I have learned about the importance of patience and collaboration.

In the Tree of Life programme at the Sanger Institute, we are working on many Biodiversity Genomics projects, for instance the Darwin Tree of Life programme, Project Psyche, AEGIS, and the Aquatic Symbiosis Genomics Project. Since we are producing genome assemblies at scale – currently up to 40 per week – we can afford the luxury of sidelining certain species or even whole clades that are recalcitrant to current methods of extraction, sequencing, or assembly and instead turn to others. It’s better to be patient, pause things and invest in R&D, rather than running the risk of using up all available material and/or driving your lab scientists and bioinformaticians insane.

Our R&D never sleeps and we work closely with others in the field, both in-house and internationally, to constantly exchange the latest advances and enable previously impossible things. This way we found out that some samples we thought we’d never get sequenceable DNA out of are actually good to go now. Even ultra-low amounts of DNA are sequenceable when suitably amplified. Many people contributed to identifying the ideal enzymes and conditions for this and are still working on further optimisations. When it comes to sequencing, it turns out that no single technology rules them all; you have to pick and sometimes mix to get the best outcome. And we’re all benefitting from the constant improvements in algorithms and available software that allow us to correctly piece together what was left terribly fragmented and misordered not long ago. The EBP provides a fantastic ecosystem for swift dissemination of knowledge on the latest successes and failures so that new approaches can immediately benefit many projects out there.

Howard et al. 2025

Are we approaching a point where sequencing is no longer the bottleneck, but assembly and interpretation are?

Giulio Formenti: For many genomes, especially standard vertebrate genomes, we are rapidly approaching that point. Sequencing technologies have improved dramatically, and generating sufficient long-read data is increasingly straightforward compared to just a few years ago. The major bottlenecks are shifting toward assembly, curation, and interpretation, particularly for repetitive, heterozygous, polyploid, or structurally complex genomes. Producing truly complete references still often requires substantial manual review and expert interpretation. In many ways, the challenge is no longer generating data, but converting that data into biologically accurate and fully resolved genome representations at scale. Solving that bottleneck will require the next generation of assembly algorithms, graph-based methods, and likely AI-assisted genome curation.

**Giulio Formenti** is a Research Assistant Professor at The Rockefeller University, Co-Director and Bioinformatics Lead of the Vertebrate Genome Laboratory, and Chair of the Assembly Group for the Vertebrate Genomes Project (VGP).

**Tara Paton** (seated) and **Sachin Desai** (standing) monitor a sequencing run, tracking flow cell occupancy, data output at key time points, and DNA fragment length. Team members not pictured: Lan He, Sanjeev Pullenayegum, and Karen Ho.

Can you recall a moment when a “hard” genome finally came together?

Tara Paton: It is always top of mind for us how precious the samples we receive are. They may be from a species that is endangered or threatened, a biologist may have had to go to extraordinary lengths to collect the sample, or the specimen may have very little material to start with. Further, many species have special significance to the communities to which they belong. We have all heard the term “personalized medicine” and for this project, we really are doing “personalized sequencing”! Each DNA sample is examined carefully before any experiments are started. The questions we might ask are: How much DNA is there? What is its quality/size? Are there known issues with the specimen's collection or storage? What are the sequencing goals? Is the organism's DNA possibly refractory to sequencing, as can be the case for plants or marine organisms? Considering these factors, the lab can formulate a plan for each sample that will maximize data acquisition success.

**Orange-footed sea cucumber** (*Cucumaria frondosa*) on the floor of the St. Lawrence River. Sequencing this species required the development of new DNA-cleaning approaches after initial sequencing attempts produced almost no usable data.

A species that stands out for us in terms of a technical breakthrough would have to be the orange-footed sea cucumber (*Cucumaria frondosa*)! It was one of those DNA samples where the library preparation seemed to go perfectly well, but produced almost no sequencing data at all. It was early in the project and we had not encountered a sample like this before. Through an iterative process of DNA cleaning methods using a new sample, we finally generated sequence and I nearly danced through the lab with joy! Since then, we have refined the method to preserve the DNA length as much as possible and we are very happy with the results. We have made similar breakthroughs with other marine organisms, plants and some insects. We have exceptionally talented laboratory staff who work tirelessly to produce quality data for this project and I am very proud of them. Some specimens continue to challenge us and I have no doubt that our toolkit for dealing with them will continue to grow.

How important is persistence in this line of work?

Amy Denton: Persistence is very important, but so is patience, as difficult genomes can often take time to complete! In my four years working in the Tree of Life Core Laboratory, being persistent and not giving up on samples has enabled several breakthroughs and the generation of several genomes we wouldn’t have thought possible a few years ago, while being patient has made sure I have made the right decisions on how to progress samples. Sometimes you might just need to try a different tissue type to get the DNA needed for reference genome generation; other times you might need to wait for developments in the available sequencing technologies. Sometimes a new protocol developed for one difficult species can also benefit another: the Modified Omega Bio-Tek protocol was initially developed for high molecular weight (HMW) DNA extraction from jellyfish, but we have found that it also helps extract HMW DNA from other metazoan species such as the acorn worm *Saccoglossus kowalevskii*, which had been in R&D for three years! I’m hopeful that remaining both persistent and patient will enable me to continue to make breakthroughs like this and enable reference genomes of difficult species to be generated in the years to come.
