Abstracts
University of Florence | Biodiversity Genomics Europe
Navigating the European Reference Genome Atlas
The European Reference Genome Atlas (ERGA) is a network of over 700 researchers dedicated to cataloguing the eukaryotic biodiversity present in Europe by generating high-quality chromosome-level reference genomes. ERGA is an open community that brings together scientists of diverse origins, backgrounds, career stages, and areas of expertise to advance genomic research in a distributed, decentralised way. Everyone is welcome to register and become an ERGA member, but how can you take the next step to truly engage with the community and benefit from it? In this talk, we will introduce the structure of our community and present examples of researchers and projects that benefited from ERGA. We will also highlight some initiatives that have fostered connections and collaborations and facilitated knowledge sharing within ERGA and beyond. Finally, we will showcase examples of innovative communication channels that are promoting connections between scientists and beyond. Regardless of your level of experience with generating, analysing, and applying reference genomes, if these topics interest you, you are invited to come and join the ERGA mission!
Leibniz Institute for the Analysis of Biodiversity Change, Museum Koenig Bonn
Sampling for reference genome sequencing across Europe with community participation
Sampling for genome sequencing is crucial to the advancement of the Biodiversity Genomics Europe (BGE) project, ensuring the collection and delivery of high-quality biological material for reference genome generation across Europe, with a strong emphasis on community involvement. We target critical biodiversity areas and biodiversity hotspots, with a special focus on pollinators, to achieve comprehensive coverage and representation of European eukaryotic species. This talk will provide a concise overview of the species nomination process, prioritisation steps, sample transfer workflow, morphological vouchering, biobanking, metadata collection and data management. Led by LIB-ZFMK, and involving five other partner institutions, the sampling workflow is an example of a collaborative effort to preserve and understand Europe's biodiversity, balancing community participation with scientific rigour.
Centro Nacional de Análisis Genómico
The ERGA Genome Tracking Console
The Earth BioGenome Project (EBP) represents a monumental global effort to sequence, catalog, and analyze the genomes of all known eukaryotic species on Earth. Achieving this 'moonshot' goal necessitates a high level of international cooperation and coordination. The European Reference Genome Atlas (ERGA), the European node of the EBP, coordinates the sampling, sequencing, and bioinformatics communities in Europe. Through the Biodiversity Genomics Europe (BGE) project, ERGA aims to generate reference genome assemblies for several hundred species, more concretely 450 Gb of assembly span. Initially, species lists, genome data, and metadata were managed using shared spreadsheets and emails, which can be prone to errors and lead to miscommunication. To streamline coordination and minimize these problems, we developed the ERGA Genome Tracking Console (ERGA-GTC), a comprehensive online tool implemented using the Django web framework. ERGA-GTC tracks progress from species selection through sample acquisition, sequencing, assembly, and annotation. It integrates with systems such as GoaT, the Collaborative Open Plant Omics (COPO) platform, and the ERGA Data Portal. The ERGA-GTC manages species lists, sample collection details, deadlines for sample providers, and barcoding and biobanking status. It also facilitates coordination with sequencing centers and genome assembly teams and monitors the status of read data and assembly submissions to ensure timely data submission to the ENA. By providing a unified platform for tracking and managing the components of the ERGA-BGE project, ERGA-GTC significantly enhances project efficiency and fosters the collaboration needed to achieve the ambitious objectives of the EBP.
Spanish National Research Council / University of Minho
Barcode reference library curation
As metabarcoding applications continue to grow at a global scale, there is an increasing need for tightly curated barcode reference libraries, to be able to translate metabarcode sequence output into reliable species-level inventories. Locally developed reference libraries are desirable, but not always achievable, and even when available are unlikely to be complete. Global DNA sequence repositories, such as the Barcode of Life Database (BOLD) and INSDC, are powerful resources for the assignment of taxonomic identity to metabarcode sequence output. While these repositories are increasingly rich in species-level barcode records, a number of challenges must be confronted to move from repositories to curated reference libraries. These challenges include, among other things: taxonomic misspellings and synonyms, taxonomic error, mislabelling, sequence contamination, and non-diagnosability of species through shared genetic variation. The European projects Biodiversity Genomics Europe and eDNAqua-Plan are both addressing these challenges to design a digital ecosystem of DNA sequence repositories and European-level curated reference libraries, with a focus on aquatic organisms and biomonitoring in the latter case. We aim for a sustainable and reliable infrastructure adhering to FAIR (Findable, Accessible, Interoperable, and Reusable) principles, thereby leading the way for more effective conservation and management strategies. In this presentation we will discuss how these challenges are being addressed and progress to date.
Naturalis Biodiversity Center
Scaling up DNA barcoding using Oxford Nanopore sequencing
As species extinction rates accelerate, the need to document and understand biodiversity has never been more urgent. With the growing need for high-resolution biodiversity data in ecological assessments, DNA barcoding and metabarcoding have become important tools in the monitoring toolbox. However, to successfully employ these techniques, more complete reference databases and more resource-efficient methods are needed. Novel techniques, such as Oxford Nanopore sequencing, can help make these processes more cost- and time-efficient. Within the ARISE program, a large-scale project dedicated to building research infrastructure to enable (semi)automatic identification of all multicellular life in the Netherlands, we have revised the whole DNA barcoding workflow, cutting costs for single-specimen sequencing by more than 80%. Without the constraints of Sanger sequencing, new and exciting possibilities have opened up, such as using longer marker regions, or co-sequencing host and endosymbiont species simultaneously. Similar developments are happening around the world, and we will highlight some of these, hoping to inspire many more people to adopt these novel methods.
Natural History Museum, London
Skimming at scale: bringing historic collections into the genomic era
It has been estimated that, globally, natural history collections contain over 1 billion specimens. Vast numbers of these specimens were collected before the structure of DNA was known, long before we learned to sequence DNA. These specimens are witnesses to global change and in many cases the only representatives of the species known. They have been patiently waiting to reveal their secrets, and this is now becoming feasible at scale. Our approach has been to combine minimally destructive and ancient DNA extraction methods with ssDNA and dsDNA library preparation, and to scale these methods down to suit small tissue samples, thereby dropping consumable costs while enabling robotics to increase throughput. When combined with highly multiplexed sequencing the cost per specimen drops to the point where museum samples can be incorporated into a myriad of studies, with uses far beyond the original goals of those who collected them many decades ago.
Leibniz Institute for Zoo and Wildlife Research
Scaling-Up the Production and Impact of Reference Genomes through the BGE Genome Streams
The BGE Genome Sequencing Stream aims to significantly enhance the production and impact of high-quality reference genomes within the European Reference Genome Atlas (ERGA), aligning with global and regional biogenomics initiatives. This stream facilitates strategic planning, collaboration, and production of genomic data through enhanced synergies within and across disciplines. Key achievements include establishing ERGA as the European node for the Earth BioGenome Project (EBP), developing data protection and governance structures, and creating comprehensive codes of conduct and open data policies. By connecting sampling, sequencing, and data-processing workflows, the Genome stream has delivered high-quality genomic data and pioneered new protocols. It has also advanced the application of genomic data in biodiversity characterization and conservation. Despite initial bottlenecks with logistics in a pan-European distributed context, strategic adjustments have been implemented to ensure continued progress. This structured approach is pivotal for future scaling up in the production of reference genomes, thereby enhancing our understanding and conservation of biodiversity on a pan-European and global scale.
Royal Botanic Garden Edinburgh
Establishing a distributed DNA barcoding production pipeline in Europe through the BGE Barcoding Stream
The Biodiversity Genomics Europe DNA Barcoding Stream aims to support the production and utilisation of DNA barcodes in Europe. This includes coordination of networking activities, and then establishing a production flow ranging from sample selection and acquisition, DNA sequencing, data processing and analyses, through to practical applications. Key highlights from the first 18 months of BGE include establishing a European node for the International Barcode of Life (iBOL) and evaluating optional approaches for establishing national DNA barcoding networks and structures. A gap analysis has been undertaken to guide sample prioritisation for reference library construction, and work has commenced on large-scale sampling in the field and from museum collections. DNA sequencing protocols have been optimised across different sequencing centres, with a key focus being to drive down the costs and increase the throughput of museum specimen barcoding. A European mirror of the Barcode of Life Datasystem (BOLD) has been established and work is underway to further develop BOLD and the wider DNA barcoding informatics landscape in Europe. Finally, pipelines are being established to support curation of DNA barcode reference libraries for taxa important for biomonitoring, and more generally to provide user-friendly DNA barcode workflows for metabarcoding analyses.
Tree of Life, Wellcome Sanger Institute
Cnidarian wet lab workflows to generate high quality jellyfish and coral genomes for the Aquatic Symbiosis Genomics project
The Aquatic Symbiosis Genomics (ASG) project is producing high quality reference genomes from a diverse group of freshwater and marine organisms, representing approximately 500 symbiotic relationships. These samples are collected by global hubs, and shipped to the Sanger Institute where they are handled in the Tree of Life Core Laboratory. To date, 493 species have been processed, with the aim of extracting DNA of sufficient quality and quantity for long read sequencing. The Cnidaria (jellyfish and corals) have been particularly difficult samples to process, with DNA extractions generally producing low yield and poor quality DNA (jellyfish) or high yield and poor quality DNA (corals). Here we present results of extensive R&D work that has proved successful in improving DNA extractions for these two Cnidarian groups. In particular, we’ll introduce a DNA extraction method that has improved jellyfish DNA yields and quality, and outline how the introduction of bead beating has resulted in efficient sample disruption and downstream DNA extraction in corals.
University of California, Merced
Symbioses in 3D: the diversity and dynamics of pelagic symbioses
The lit and unlit pelagic zones present numerous physiological and biological challenges to which animals exhibit various adaptations. These adaptations give pelagic fauna powerful ecosystem roles as predators and prey, capable of altering ecosystem structures. For some animals, symbiosis is a crucial adaptation to occupying this niche, with various biological implications including nutrition, communication, and development. However, much is still unknown about the role of symbiosis in the evolution, ecology, and biology of pelagic animals. Our global team of marine ecologists, evolutionary biologists, natural historians, and genomicists with a shared interest in how pelagic organisms survive and prosper in coastal and open oceans is studying four major lineages of pelagic animals (pyrosomes, medusozoans, ctenophores, acoels) and their symbionts (proteobacteria, zooxanthellae, flagellates, green algae). Our aim is to accelerate understanding of the eco-evolutionary assembly and disassembly of host-specific symbioses from a diverse pool of pelagic microbes, specificity and plasticity to changing environments, how symbioses influence ecosystem dynamics, and how these organisms inform evolution and development of metazoans, including bilaterians and vertebrates.
University of Rhode Island
Speciation across depth gradients in reef corals
Adaptation to different light environments associated with varying depths could provide opportunities for ecological speciation in corals with photosynthetic symbionts and environment-mediated spawning events. Here, we show that differences in depth distribution are common in sister lineages of corals from a variety of taxonomic groups and locations. We then explore in detail the molecular drivers behind depth-associated adaptive divergence by documenting patterns of sequence divergence for proteins related to environmental sensing in depth-segregated and light-dependent lineages in the Orbicella species complex. Specifically, we analyzed species of Orbicella that exhibit genetic divergence across a depth gradient that may reflect recent (~500 Kyr) incipient speciation. Genome-wide variation suggests divergence across depth occurred by adaptation via positive selection on rhodopsin-like G-protein-coupled receptors (GPCRs). These molecules are known to serve as chemo/photo/thermo-receptors mediating environmental transduction signals to enhance physiological adaptation across different environments while also being involved in reproductive isolation via differences in time of spawning in corals. Our study posits a molecular mechanism for the origin of depth-segregated coral species shared across the anthozoan tree of life in systems in which ecological divergence operates at spatial scales smaller (< 1 km) than their larval dispersal potential and highlights avenues contributing to generating biodiversity in the sea.
Marine Research Institute, Spanish Research Council (IIM-CSIC)
NOD-like receptor diversity in Mediterranean sponge holobionts
In the last two decades, symbiosis research has forced a paradigm shift: immunity acts as a modulator of the microbiome and not only as a shield against pathogens. Beyond the bias towards pathogenicity, immunology has mainly focused on terrestrial animals. And yet, the chance that these animals will encounter pathogens is rather low compared to marine organisms, which live and evolve in a microbial ocean. This is particularly true for marine sponges (phylum Porifera): they encounter billions of microbial cells a day while they filter-feed but at the same time harbor specific, stable and complex microbiomes. How do sponges differentiate between beneficial, harmless or pathogenic microbes? Early sequencing studies revealed a complex repertoire of immune receptors in sponges, among these the NOD-like receptors (NLRs), with the potential to play a role in microbial recognition and specific discrimination. We have investigated the NLR repertoire in the sponge symbiosis model Dysidea avara. We identified > 150 NLR genes in the D. avara genome. We found various genes with a NACHT domain, assembled in both canonical and non-canonical domain combinations. NACHT domain-containing genes are found across all D. avara chromosomes. The NLR phylogeny showed two distinct clusters according to the 5’ terminal domain. We compared the diversity of NLRs in D. avara to other sponge genomes and reconstructed evolutionary patterns. This new dataset will allow exploring the intraspecific patterns of NLRs in sponges and their implications in the ecological success of this species.
The University of Memphis
Convergent evolution of metabolic systems in phytophagous beetles (Coleoptera: Buprestidae and Cerambycidae)
About a quarter of the known animal species on Earth are phytophagous insects, totaling about 401,000 described species, with beetles representing a significant portion of this diversity. The majority of these phytophagous beetles belong to the clade Phytophaga (leaf beetles, longhorn beetles and weevils) and the family Buprestidae (metallic woodboring beetles). Plants serve as their primary food source, which presents challenges due to their rigid structure and recalcitrant cell wall, in addition to a lack of essential nutrients such as protein for growth and development. To address these challenges, phytophagous beetles have evolved the ability to degrade plant cell walls using either symbiont-derived or endogenous plant cell wall degrading enzymes (PCWDEs). Despite diverging from a common ancestor over 200 million years ago, Buprestidae and the Phytophaga have undergone convergent evolution in their metabolic systems, resulting in strikingly similar genomic repertoires of endogenous PCWDEs. However, the apparent evidence of convergence towards similar solutions to the same metabolic challenges between these groups, separated by such a long evolutionary time, raises intriguing questions. Here, we present a new perspective on the relationship between enzyme structure and function, host plant preference, and the genomic causes and consequences of metabolic convergence in herbivorous beetles. To supplement this work, we are using emerging deep learning tools to understand the 3-D structures and evolution of these enzymes, with the goal of unraveling the intricacies of metabolic convergence in these distantly related beetle groups.
University of Arkansas
A genomic framework for chemosensory evolution in longhorn beetles
Exaggerated sensory antennae are a defining characteristic of the longhorned beetles (Cerambycidae), a diverse family of wood-boring insects that includes many destructive pests of trees. The chemical ecology of cerambycids is well-studied and spans an extensive array of host volatiles, long-range pheromones, and cuticular hydrocarbons, providing a novel but highly suitable system in which to explore the growing field of chemosensory genomics. Here, we present sequence data from whole genomes and antennal RNAseq of numerous cerambycid species, alongside our efforts to develop a combined approach that emphasizes: 1) the rapid and accurate identification of chemoreceptor genes in novel cerambycid genomes; 2) the evolution of chemoreceptors and pheromone receptors across the cerambycid subfamilies; and 3) the application of AI and deep learning to study the structural evolution of chemoreceptor proteins.
Department of Biological and Medical Sciences, Oxford Brookes University
The last days of Aporia crataegi (L.) in Britain: evaluating genomic erosion in an extirpated butterfly
Evaluating genomic erosion holds promise for helping to both identify and monitor threatened species or populations. Such an approach could be particularly useful for invertebrates, which cannot always be easily monitored. Additionally, consistent patterns of genomic erosion in extinct or extirpated species can reveal hallmarks of the extinction process. The Black-veined White butterfly (Aporia crataegi) was extirpated from Britain in the 1920s. Here, we sequenced historical DNA from 17 museum specimens collected between 1854 and 1924, using only one or two legs per specimen, to reconstruct demography and compare levels of genomic erosion between extirpated British and extant European mainland populations. We contrast our results for A. crataegi with those from modern samples of the Common Blue butterfly (Polyommatus icarus), a species with relatively stable demographic trends in Great Britain. We find evidence of bottlenecks and reduced genetic diversity in populations of both A. crataegi and P. icarus in Britain, consistent with a post-glacial colonisation history. However, more symptomatic of A. crataegi’s disappearance were significant increases in large runs-of-homozygosity (RoH), potentially indicative of recent inbreeding, and accumulation of putatively mildly and weakly deleterious variants, neither of which are observed for P. icarus. Our results suggest the metrics of genomic erosion hold potential for identifying small or threatened populations from a handful of individual genomes. Additionally, we show that population genomic investigations of >100 year old insect museum specimens can help shed light on extinction and extirpation processes.
University of Edinburgh
Assessing genetic diversity across DToL released flowering plant genomes
Genetic diversity determines the ability of species to evolve and adapt as well as the fundamental biology of species. Despite this, little is known about the drivers or processes behind the variation in genetic diversity we see in nature. Here, we use flowering plant genomes of species native to Britain and Ireland released by the Darwin Tree of Life (DToL) project, paired with resequenced genomes, in order to understand differences in genetic diversity across flowering plant orders. First, comparative genomics of the DToL genomes demonstrates how diversity clusters relative to phylogenetic assignment. Using genome-wide patterns of nucleotide diversity and heterozygosity across windows, we then present the distribution of Runs of Homozygosity (ROH) to test these findings. At a species level, we show that diversity does not cluster by depth of phylogenetic branching. Using population-level resequencing data for a subset of species across the phylogeny, we are then able to show that diversity patterns are broadly consistent between populations of the same species distributed across Britain. Our results provide evidence of the complex evolutionary processes governing patterns of diversity across native flora, and demonstrate the need for large-scale studies across taxa when trying to determine the causes and consequences of genetic diversity seen across species and populations.
Biodiversity Genomics Laboratory, University of Neuchâtel, Switzerland
Comparative Genomics to Unravel Chromosomal (In)-Stability in Butterflies
Large-scale chromosomal rearrangements may act as reproductive barriers, contributing to the formation of new species. Lepidoptera, i.e. butterflies and moths, are among the most diverse taxonomic groups across the Tree of Life. They have holocentric chromosomes without defined centromeres and therefore, chromosomal fusions and fissions may be less deleterious and more likely to be retained. Accordingly, some butterfly clades show a tremendous diversity in chromosome numbers. These clades often show bursts of species diversity, suggesting a role of rearrangements in species diversification. To understand why some clades show multiple chromosomal fusions and fissions, while others have conserved chromosomes, we generated and compared chromosome-scale genome assemblies for Erebia butterflies. We ask if differences in rates of chromosomal rearrangements, associated with different rates of speciation between chromosomally stable and unstable lineages, are explained by (i) differential repetitive element expansions, (ii) differences in 3D genome structure, and/or (iii) differences in DNA repair genes. Overall, we are disentangling the association between genomic features, chromosomal rearrangements, and species diversification for species with holocentric chromosomes.
LOEWE Centre for Translational Biodiversity Genomics
Seven new reference-quality side-necked turtle genomes illuminate turtle evolution
Stem-turtles were the ultimate 'hopeful monsters'. Unprecedented reconfigurations of the tetrapod skeleton gave rise to their unique body plan, including one of the most iconic novelties: the turtle shell. Turtles further evolved extreme longevity and repeatedly switched from temperature-dependent to genetic sex determination. However, studying the molecular basis of these novelties and adaptations was hampered by a lack of reference genomes for one of the two extant turtle clades: side-necked turtles (Pleurodira). Thus, we generated seven new reference-quality side-necked turtle genomes to comprehensively analyze genes under positive selection, chromosomal rearrangements and gene losses in stem-turtles. By generating haplotype-resolved genome assemblies and re-sequencing males and females of genetically sex determining turtles, we further identify sex chromosomes and characterize their evolution in side-necked turtles. In contrast to a previous hypothesis, our data show that genetic sex determination does not co-evolve with chromosome numbers in turtles. Chelid sex chromosomes only originated once from a microchromosome >80 mya, but remained highly homomorphic. Macro-sex chromosomes evolved by subsequent chromosome fusion. In combination, we present compelling evidence against a long-standing hypothesis regarding genome evolution during transitions from temperature-dependent to genetic sex determination in turtles and identify key genomic changes linked to turtle innovations.
Wellcome Sanger Institute
One year on in the Tree of Life Core lab: continuous improvement from 2023/4
The Tree of Life Core laboratory provides wet-lab and sample processing support for all of the work within the programme. The Tree of Life programme is a major contributor to several Earth Biogenome Project affiliated projects - Darwin Tree of Life, Aquatic Symbiosis Genomics, Psyche, Vertebrate Genome Project, European Reference Genome Atlas and more, with a main emphasis on the production of reference level genome assemblies. Over the past year production has scaled dramatically, with larger numbers of species completing month on month. Several factors have contributed to this:
- Improved data sharing and transparency
- Auditing of systems to track samples through various pipelines
- Initiating new workflows - such as PiMmS
- R&D development tailored to taxonomic areas of interest
- Processing ‘top-ups’ - using various strategies to complete data sets
Here we provide details of work that has been instrumental towards improving success rates, reducing rework and completing more genome data sets than ever before. Importantly, we provide information on what we would recommend as others start this journey, as well as what to avoid - aiming to prevent others falling into the potholes that we have found.
Wellcome Sanger Institute
Project Psyche - Lepidopteran Genomes for Europe
Reaching the aims of the Earth Biogenome Project (EBP) to sequence all life on earth requires large-scale sequencing initiatives that target specific taxonomic groups and geographic regions. Here, we present Project Psyche, a new sequencing initiative that focuses on Lepidoptera (butterflies and moths) in Europe (www.projectpsyche.org). Lepidoptera are a prime model system for evolutionary, ecological, conservation, agricultural and developmental biology studies and they represent key pollinators, herbivores, and prey for a range of predators. Some of them are important agricultural or household pests, others are used as biodiversity indicators, and due to their beauty, Lepidoptera are ideal for outreach and environmental education. Project Psyche aims to generate chromosome-level reference genomes of all 11,000 species of Lepidoptera in Europe. Lepidoptera have small genomes that are easy to sequence and assemble to chromosome level and they will provide amazing opportunities for studying a wide range of questions. All genomes will be made openly accessible and analyses will be performed collaboratively. Project Psyche is complemented by LepEU, an initiative to generate population genomics datasets of butterflies across Europe. Together these two projects are supported by a COST Action that provides a framework for collaboration among researchers using these genomes, citizen scientists, and expert societies. Here we will present our plans, ideas for optimal collaboration, and results from a first set of 210 genomes.
University of the Andes, Colombia
Epic biodiversity, struggling genomics
Colombia is proudly one of the most biodiverse countries, hosting roughly 10% of all eukaryotic species on earth, and is famous for its high diversity of birds, butterflies, orchids, palms, fishes, amphibians, and other groups, spread across the Pacific and Amazonian rainforests, tropical grasslands, deserts, dry forests, páramos atop three chains of the Andes, plus two oceans. Thus, Colombia should be an epicenter of biodiversity genomics (BG). This talk will provide details about the deep challenges for BG in Colombia and focus on empirical examples of successful projects to highlight solutions. Limited governmental research funding requires genomicists in Colombia to make more immediate links between reference genomes and pressing social needs. For basic science and conservation, BG in Colombia relies very heavily on international collaboration and funding, and examples will be presented. Genome sequencing infrastructure in Colombia is limited, e.g., few Oxford Nanopore platforms, some Illumina, but no PacBio yet. Sadly, even with access to infrastructure, sequencing in-country always costs more than in a developed country. Colombia also hosts deep cultural diversity, including Afro-Colombian communities and over 100 indigenous groups. Simultaneous with the BG24 conference is the Convention on Biological Diversity’s COP16 hosted by Colombia, where the community of nations will make decisions about how access and benefit-sharing (ABS) between biodiversity stewards and the users of genomic data will be applied to digital sequence information (DSI). The decisions being made today may have a profound impact on how Colombian BG will be conducted tomorrow.
Pontificia Universidad Católica de Chile
The 1000 Genomes: Sequencing Chilean Biodiversity
The 1000 Genomes Project is a Chilean national initiative aimed at generating reference genomes for native biodiversity. Chile possesses a unique biodiversity with high endemism and species distributed across extreme environments, ranging from the Atacama Desert, salt flats, high altitudes in the Andes, and forests, to Antarctica. Chile also has an extensive marine coast across different latitudes along the Pacific, with upwelling areas providing nutrients to marine organisms. Genomes therefore allow us to understand how these species have adapted; generate information about their functioning; apply this knowledge in conservation actions; and obtain important information for economic sectors (e.g., agriculture, aquaculture) and public health. This is a national initiative led by the Millennium Institute Center for Genome Regulation (IM-CRG) in collaboration with approximately seven other Centers of Excellence in the fields of biodiversity, Antarctic research, molecular biology, mathematical modeling, and deep sea research. It is internationally associated with the Earth Biogenome Project (EBP). Different committees have been established, contributing to different stages of the project workflow. This initiative involves citizen participation: the public votes through the website on the species to be sequenced, after candidates have been selected and placed on a long list by the taxonomic committee. We aim to train future scientists in universities as well as in schools through the 'Chile Sequences Chile' program. The 1000 Genomes project will also generate ethical and legal discussions on the topic, supporting public policies. This initiative places Chile as a pioneer in Latin America, with significant potential for discovering unique adaptive capacities.
Vale Institute of Technology
The Genomics of Brazilian Biodiversity Consortium
In January 2023, the 'Genomics of Brazilian Biodiversity' (GBB) consortium was launched as a partnership between the Vale Institute of Technology (ITV) and the Chico Mendes Institute of Biodiversity Conservation (ICMBio), the national authority in charge of developing, implementing, and managing policies promoting biodiversity conservation in Brazil. This research program aims to generate baseline genomic information to support conservation actions targeting threatened Brazilian biodiversity, as well as the genetic knowledge needed to enhance native species and varieties already linked or potentially relevant to the bioeconomy. In addition to ICMBio and ITV, dozens of Brazilian academic institutions have so far joined the GBB consortium, and together will be responsible for reaching the following main deliverables by the end of 2027: 1) structuring a national network for generating barcode references (i.e., mitogenomes and plastomes) for species identification; 2) sequencing high-quality reference genomes for selected species of the fauna and flora, including those of interest for the Brazilian bioeconomy; 3) resequencing genomes for i) estimates of population structure and genetic diversity for wildlife and captive species of conservation concern; and ii) association of phenotypic and environmental traits of species of interest for the bioeconomy; and 4) implementing a national monitoring scheme of target species/biological communities using environmental DNA (eDNA). As of July 2024, 69 projects encompassing the scope of activities outlined above have either started or will soon be developed under the GBB umbrella, with many more joining the consortium soon. New partners are welcome and needed to scale up the generation of genomic data for Brazilian native species.
Pontifical Catholic University of Rio Grande do Sul
GenoTropics: Neotropical Adaptive Genomics
The talk will describe the scope, current composition, vision and activities of the GenoTropics consortium (https://www.genotropics.org/), a network of scientists based in Brazil and Germany working on adaptive genomics of Neotropical taxa. The consortium aims to promote the advancement of Biodiversity Genomics in the Neotropics via collaborative research, training workshops and the fostering of cooperation among stakeholders such as universities, governmental agencies and NGOs within and among Latin American countries and also in the context of global scientific networks. The research projects conducted so far have focused on developing genomic resources for Neotropical taxa and applying them to address evolutionary questions focused on phylogenomics, speciation, hybridization, demographic history and population structure, which enable in-depth assessments of adaptive evolution and inform conservation planning on behalf of the target species. The training component has so far included two workshops along with satellite meetings focused on methodological topics and project discussions. The 2023 workshop focused on methods in Biodiversity Genomics and included a final symposium discussing the future of biodiversity genomics in Brazil. The 2024 workshop ('Equalitarian collaborations for the future of Biodiversity Genomics') included 10 sessions covering all aspects of biodiversity genomics, from legal considerations, sampling protocols and banking to bioinformatics, collaborative networks and equitable sharing of genetic resources. A proposed 2-year extension of the consortium funding aims to incorporate additional research groups, expanding the geographic, taxonomic and scientific scope of its collaborative research projects, and consolidating GenoTropics as a catalyst for efforts to understand and conserve Neotropical biodiversity.
The Ohio State University
Towards the high-quality, de novo annotation of transposable elements in eukaryotes
Sequencing technology and assembly algorithms have matured to the point that contiguous de novo assembly is possible for large, repetitive genomes. Numerous methods exist for annotating the various types of transposable elements (TEs), but their performance is often suboptimal. Moreover, the diverse TE landscapes of eukaryotic genomes challenge each pipeline to produce high-quality TE annotations. We benchmark existing programs against carefully curated TE annotations of model species. Using the most robust programs, we create a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces annotations of structurally intact and fragmented transposable elements. EDTA is an evolving program: we continuously improve existing pipelines and incorporate them into EDTA for more robust and scalable TE annotations in genomes with diverse TE landscapes. The resulting TE annotations have promoted a much more in-depth understanding of the diversity and evolution of TEs at both intra- and inter-species levels.
University of Greifswald
Tiberius: Accurate Ab Initio Gene Finding
We present our new deep learning gene finder, Tiberius, that seamlessly integrates deep learning sequence-to-sequence layers with an HMM layer. All time-consuming calculations are done fully on GPU and in parallel. Currently, Tiberius' only input is a genome, making it an ab initio predictor. We benchmarked Tiberius against Helixer, AUGUSTUS, BRAKER3 and GALBA on three complete mammalian genomes. Tiberius achieves a gene-level F1 score of 52% and outperforms the other ab initio gene finders Helixer (21%) and AUGUSTUS (15%), by a wide margin. It almost reached the accuracy of BRAKER3 (54%), although BRAKER3 was given RNA-Seq data and a protein database as inputs, while Tiberius was given neither. Tiberius even slightly surpassed the accuracy of BRAKER3 when we incorporated an evolutionary signal learned from a multiple alignment of unannotated genomes. It takes just under 2.5 hours to annotate a whole mammalian genome on a GPU. We will present data suggesting that BUSCO completeness is a poor proxy for gene structure accuracy. Currently, Tiberius is only trained on mammals, cannot integrate RNA-Seq data and cannot predict alternative splicing.
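The gene-level F1 scores quoted above combine precision and recall into a single number. The abstract does not say how a predicted gene is matched to a reference gene, so the sketch below assumes the strictest convention (a prediction counts only if its structure is identical to a reference gene); the identifiers are made-up toy strings, not benchmark data.

```python
def gene_level_f1(predicted, reference):
    """Return (precision, recall, F1) for exact-match gene structures.

    Each gene is represented by any hashable description of its structure;
    a prediction is a true positive only if it is identical to a reference
    gene (an assumption of this sketch).
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data: four reference genes, three predictions, two exact matches.
reference = {"geneA:100-900", "geneB:1500-2400", "geneC:3000-3800", "geneD:5000-6100"}
predicted = {"geneA:100-900", "geneB:1500-2400", "geneX:7000-7500"}
precision, recall, f1 = gene_level_f1(predicted, reference)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, a predictor cannot score well by over-predicting genes (high recall, low precision) or by predicting only the easiest ones (high precision, low recall).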
Rutgers University
De novo genome assembly for an endangered lemur using portable nanopore sequencing in rural Madagascar
As the most threatened mammalian taxa, lemurs of Madagascar are facing unprecedented anthropogenic pressures. To address conservation imperatives such as this, researchers have increasingly relied on conservation genomics to identify populations of particular concern. However, many of these genomic approaches necessitate high-quality genomes. While the advent of next-generation sequencing technologies and the resulting reduction of associated costs have led to the proliferation of genomic data and high-quality reference genomes, global discrepancies in genomic sequencing capabilities often result in biological samples from biodiverse host countries being exported to facilities in the Global North, creating inequalities in access and training within genomic research. Here, we present the first reference genome for the endangered red-fronted lemur (Eulemur rufifrons) from sequencing efforts conducted entirely within the host country using portable Oxford Nanopore sequencing. Using an archived E. rufifrons specimen, we conducted long-read nanopore sequencing at the Centre ValBio Research Station near Ranomafana National Park, in rural Madagascar. Exclusively using these long-read data, we assembled a 2.21-gigabase, 20,330-contig nuclear assembly with an N50 of 98.9 Mb and a 17,108 bp mitogenome. The nuclear assembly had 31x average coverage and was comparable in completeness to other primate reference genomes, with a 95.47% BUSCO completeness score for primate-specific genes. As the first reference genome for E. rufifrons and the only annotated genome available for the speciose Eulemur genus, this resource will prove vital for conservation and evolutionary genomic studies, while our efforts demonstrate the potential of this protocol to address research inequalities and build genomic capacity.
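For readers less familiar with the contiguity statistic reported above, N50 is the contig length at which contigs of that length or longer contain at least half of the assembled bases. A minimal sketch of the calculation follows; the contig lengths are toy values and not data from this assembly.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    Contigs are visited from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half of the assembly size.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running_total = 0
    for length in lengths:
        running_total += length
        if running_total >= half_total:
            return length
    raise ValueError("cannot compute N50 of an empty assembly")


# Toy example: total length 300, so the halfway point is 150 (80 + 70).
toy_contigs = [10, 20, 30, 40, 50, 70, 80]
print(n50(toy_contigs))
```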
Università degli studi di Milano
The first two Fusarium verticillioides genomes from human patients: a genomic overview
Fusarium verticillioides (FV) is a prominent plant pathogen that occasionally causes human fusariosis. Here, we present the first genome assemblies of two FV strains isolated from clinical settings: FV_05-0160 from an immunocompromised patient following bone-marrow transplantation, and FV_IUM09_1037 from a patient's blood. The genomes were analyzed on the European Galaxy platform, including read quality checks, filtering, nuclear and mitochondrial DNA assembly, and completeness assessment, followed by ab initio annotation using Augustus with FV 7600 as a reference, and a final functional annotation performed with OmicsBox software. The clinical strain genomes were 1.5-2.3 Mb larger than the reference FV 7600 genome (41.7 Mb). Phylogenetic analysis (ML phylogenetic tree, FastANI alignment, and MASH comparison) showed no host-specific clustering, suggesting that this species can adapt to diverse hosts, potentially allowing transmission from agricultural products to humans. Moreover, phylogenomic positioning confirmed species identity and showed close relatedness to strains from maize in Italy, Australia, and the USA. Comparative genomic analysis identified a unique set of genes in the human strains that were not present in the plant-derived reference strain. These genes may be involved in the pathogen's adaptation to human hosts. This study provides the first evidence, to our knowledge, of genomic differentiation in two human-pathogenic FV genomes and opens the way to comparative genomic studies searching for specific genes involved in host-niche adaptation.
National Center for Biotechnology Information, NLM, NIH
NCBI Datasets: current status and future developments on genomic data delivery at NCBI
NCBI Datasets is a resource from the National Center for Biotechnology Information (NCBI) that facilitates access to genomic data and metadata through a web interface, API and command-line tools. Since its initial release, NCBI Datasets has been under constant development, adapting to meet the data retrieval needs of the public health and biological research communities. Here, we present some of the most recent updates from the past two years. For the command-line tool, the improvements include changes in the command syntax, a new data endpoint to retrieve taxonomic information, and the consolidation of the ortholog and gene endpoints. On the web, one of the most significant changes is the release of a growing set of orthologs for approximately 2 million insect genes. This set includes gene, transcript and protein sequences that previously could only be retrieved using the command-line tool. In terms of future developments, NCBI Datasets is planning a new gene resource, with higher integration of gene information, such as orthologs, expression data, gene ontology and protein architecture and domains. NCBI Datasets contributes to the NIH Comparative Genomics Resource (CGR) mission, which aims to maximize the impact of eukaryotic genomic resources on biological and medical research, and is now the primary portal to assembled genome data at NCBI. We will continue improving our tools and data accessibility through an interactive process with our users, welcoming their feedback and engagement.
National Center for Biotechnology Information, NLM, NIH
High-quality eukaryotic genome annotation with NCBI's public EGAPx pipeline
The National Center for Biotechnology Information (NCBI) RefSeq project provides high-quality gene-model annotations for over 1100 eukaryotic organisms generated using NCBI's Eukaryotic Genome Annotation Pipeline (EGAP). To support the rapidly expanding needs of large-scale sequencing projects such as the Earth BioGenome Project, NCBI is now developing a public version of this pipeline known as EGAPx (EGAP external). EGAPx is an evidence-based pipeline using a Nextflow workflow and containerized applications to generate high-quality structural and functional annotation of most metazoan and plant genomes, with outputs suitable for GenBank submission. EGAPx currently utilizes short-read RNA-seq and protein sequences to inform model prediction for protein-coding and lncRNA genes, with IsoSeq and ONT transcript support coming soon. EGAPx is designed for easy configuration and supports execution in the cloud (e.g. AWS) or on multiple types of local HPC platforms. EGAPx annotations exhibit high sensitivity and precision in mock annotations of human and Drosophila genomes. EGAPx is being developed as part of the NIH Comparative Genomics Resource (CGR), which facilitates reliable comparative genomics analyses for all eukaryotic organisms through an NCBI Toolkit and community collaboration. This work was supported by the NCBI of the National Library of Medicine (NLM), National Institutes of Health. The workflow and documentation for EGAPx can be found at https://github.com/ncbi/egapx
University of Oklahoma
Seasonal variation in the serum proteome of the Mexican free-tailed bat (Tadarida brasiliensis)
Bats harbor a diverse assemblage of pathogens, some of which can infect humans. Although bats often host such pathogens without showing clinical disease, this tolerance of infection is hypothesized to be disrupted by intrinsic and extrinsic stressors. Yet most work to date has ignored the immune mechanisms linking these stressors to pathogen infection in bats. Bat immunology is a growing field, and expanding genomic resources play a key role in helping characterize immune components in downstream omics data such as transcriptomes and proteomes. Here, we assess whether long-distance migration may function as an immunological stressor in bats, focusing on serum proteomes from a population of Mexican free-tailed bats (Tadarida brasiliensis) sampled monthly over a year. Each year, these bats migrate north from their wintering grounds in Mexico during spring to form large maternity colonies in the southwestern United States, birth and raise their pups, and then migrate back south to Mexico. We leverage the recent genome assembly and annotation developed by the Bat1K Initiative to identify over 200 proteins spanning seven orders of magnitude in 100 seasonally collected serum samples, and we use generalized additive models and multivariate methods to test if and how the abundance and composition of the serum proteome vary between spring migratory arrival, summer birth and pup-rearing, the non-reproductive period in late summer, and the pre-migratory period in autumn. Our work emphasizes migration as an immunologically and epidemiologically relevant period for bats and highlights the need for genomic resources in downstream immunological studies of bats.
Arizona State University
Transposable Element Expansions Drive Genome Size Evolution in Squamate Reptiles
The factors controlling genome size variation among eukaryotes remain a key question in evolutionary biology. Variation in genome size does not correlate with organismal complexity (i.e., the 'C-value enigma'), and the evolutionary forces impacting the relationship between size and fundamental genomic features like gene content, non-coding DNA, and base pair composition remain unclear. Squamate reptiles, comprising over 11,000 extant species, provide a unique model to study this phenomenon. Unlike birds, mammals, and other vertebrates, squamates display intermediate, constrained genome sizes coupled with high transposable element (TE) activity and diversity. We investigated the dynamics of genome size evolution in squamates by analyzing the relationships between DNA gain, loss, and TE activity. Using a taxonomically comprehensive dataset of 111 squamate genomes, we performed ancestral state reconstructions and calculated lineage-specific DNA gains and losses at key evolutionary timepoints in squamate evolution. Our findings reveal significant positive correlations between squamate DNA gains and losses, supporting an 'accordion model' of genome size evolution in which gains are offset by losses. However, this relationship varies across squamate lineages, indicating distinct patterns in TE dynamics and their effects on genome size between major squamate clades. Our results underscore the utility of squamates as a model for genome evolution studies, suggesting that the roles of selective pressures and neutral evolution (i.e., accordion model) likely vary across squamate lineages. By integrating more than one hundred publicly available whole genome assemblies with TE annotations and evolutionary history, this study enhances our understanding of genome size regulation and the evolutionary forces shaping genome architecture in vertebrates.
The Rockefeller University, New York
Towards the great Passenger pigeon comeback
The Passenger pigeon (Ectopistes migratorius) was once the most abundant bird species in North America, numbering between 3 and 5 billion individuals before being driven to extinction by human activities in the early 1900s. Thousands of museum specimens exist, while its closest relative, the Band-tailed pigeon (Patagioenas fasciata), is still extant. Thus, the Passenger pigeon is one of the few species amenable to de-extinction, the process of restoring functional alleles in a proxy of an extinct species. The restoration of this species will particularly affect Eastern American woodland forests, which once benefitted from the disturbances generated by the Passenger pigeon’s large flocks. Indeed, this is one of the few, if not the only, truly colonial Columbidae species. Two other distinguishing Passenger pigeon traits, sexual dichromatism and a graduated tail, were probably the result of its colonial behavior and rarely occur across the family tree. These three traits are the focal subjects of the project, which involves state-of-the-art sequencing technologies and bioinformatic pipelines to determine the genomic blueprint of the Passenger pigeon. We generated VGP-quality genomes for multiple Band-tailed pigeon individuals sampled across the species’ geographic range and assembled a pangenome reference to help map the aDNA Passenger pigeon short-read data and mitigate reference biases. This approach will allow us to more confidently call variants between the two species, reconstruct the extinct species' genome, and identify the de-extinction mutations that need to be edited in the genome of the Band-tailed pigeon to restore the Passenger pigeon’s phenotypes.
Universidad Católica de la Santísima Concepción
Comparative genome characterization of honeybees (Apis mellifera) with differences in hive performance
Honeybees (Apis mellifera) play a crucial role in global agriculture and ecosystems through their pollination services. However, hive performance, which includes factors such as honey production, brood viability, and resistance to pests and diseases, varies significantly among different bee populations. This study aims to perform a comparative genomic analysis of honeybee populations exhibiting distinct hive performance traits to identify genetic factors underlying these differences. We sequenced the genomes of honeybee samples from high-performing and low-performing hives across multiple regions. Using advanced bioinformatics tools, we identified single nucleotide polymorphisms (SNPs) and structural variations associated with key performance metrics. Our results revealed several candidate genes related to immunity, foraging behavior, and stress resistance that showed significant variation between the high and low-performing groups. Notably, genes within the major histocompatibility complex (MHC) pathways were highlighted as potential contributors to enhanced hive productivity and resilience. Additionally, we observed genomic regions with signatures of selective sweeps, suggesting recent adaptation to environmental pressures and management practices. These findings provide insights into the genetic basis of hive performance and offer potential markers for selective breeding programs aimed at improving honeybee health and productivity. Future research should focus on functional validation of these candidate genes and the development of genomic tools for beekeepers. This study underscores the importance of integrating genomics into apiculture to sustain and enhance honeybee populations amidst global challenges.
Genome Sciences Centre, Vancouver
The Canadian BioGenome Project
Canadian biodiversity is one of our greatest national treasures. From coast to coast to coast, Canada is home to more than 100,000 plant and animal species, in environments ranging from desert to the Arctic. Many of these species are under threat due to rapid changes in climate and other human-led impacts on our environment. The Canadian BioGenome Project seeks to better understand and conserve our natural heritage by sequencing the genomes of 400 Canadian species, ranging from fungi to large mammals. The species we sequence are selected based on existing and established priorities of Indigenous peoples, national and regional organizations, and conservation and wildlife groups. These organizations have a history of (or a strong interest in) using genomic information to develop tools and solutions for the maintenance of biodiversity, monitoring, conservation, restoration and environmental management. The data generated will be easily accessible through a user-friendly geospatial platform of metadata and genomics data. As of June 2024, samples for 260 species in Canada have been collected. Sequencing data has been generated for 107 species (with an additional 43 species in the sequencing queue), and extractions and sequencing for 85 species are planned. Sixty-seven assemblies have been submitted to NCBI (PRJNA813333), thirteen of which are annotated using Ensembl Rapid Release.
Université d'Abobo-Adjamé
Limitations of the Cytochrome oxidase I marker in DNA barcoding and identification of marine fish from Côte d'Ivoire
Actinopterygians are the most diverse group of vertebrates, including in West African marine environments. With overfishing, water pollution and climate change having a negative impact on ecosystems, and given the ecological and economic importance of these fishes, it is essential to improve our knowledge of the species found along the coasts of Côte d'Ivoire with a view to resource management. However, the morphological approach to identification is time-consuming and requires sound taxonomic expertise. As an alternative to this approach, molecular identification based on the mitochondrial COI gene was used for species identification. This technique has been widely used to identify marine fish, but very little in West Africa. As part of a major study underway on the Atlantic coast of Côte d'Ivoire to identify marine species and set up a local reference database, we analysed the sequences of several actinopterygian specimens and deposited them in BOLD. It has previously been shown that the COI marker, although important, cannot always discriminate between plant or arthropod species. In this sample, the COI marker was unable to separate species within seven genera: Pagellus, Pomadasys, Sarda, Scomber, Selene, Trachurus and Umbrina. The taxa involved were morphologically similar or closely related, with intra-group divergence ranging from 0.00% to 5.43%, and an average of 1.18%. Ultimately, these data confirm that the COI barcode is an important tool for identifying fish, but that it often fails to discriminate between species in morphologically indistinguishable groups, which calls for the use of other markers or integrative taxonomy.
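The divergence percentages above are pairwise distances between aligned barcode sequences. The abstract does not state which distance model was used, so the following sketch assumes the simplest one, the uncorrected p-distance (percentage of differing sites among unambiguous aligned positions); the sequences are short invented strings, not COI data.

```python
def p_distance(seq_a, seq_b):
    """Return the uncorrected pairwise distance (in percent) between two
    aligned DNA sequences, ignoring gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in "ACGT" and base_b in "ACGT":
            compared += 1
            if base_a != base_b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differences / compared


# Invented 10-bp alignment with a single mismatch at the last site.
print(p_distance("ACGTACGTAC", "ACGTACGTAT"))
```

When two nominal species differ by a distance close to zero, as for some pairs reported here, a barcode match alone cannot assign a specimen to one species or the other.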
Stellenbosch University
Seagrass transcriptomic responses to heat stress
Gaining insights into transcriptional and photophysiological responses to climatic extremes, such as marine heatwaves (MHWs), is crucial to understanding how ecologically important foundational species will respond. Species distributed along an environmental cline, such as the endangered seagrass Zostera capensis, provide an opportunity to assess key functional gene expression and photophysiological responses to temperature effects between populations. Here we exposed two genomically divergent Z. capensis populations from contrasting thermal niches within the same system to a simulated MHW (34 °C for three days) in a common-stress garden approach. The population locally adapted to greater thermal stochasticity showed pre-adapted phenotypic variation in response to acute warming through activation of heat-responsive genes and molecular chaperones. Both populations showed the activation of genes involved in thermal resilience, including higher photosynthetic stability and respiratory acclimation. We conclude that the different intraspecific adaptive responses exhibited in gene-expression patterns during recovery provide critical information on thermal adaptation in aquatic habitats under climatic stress. In this study we identify transcriptomic mechanisms that may facilitate intrapopulation differential resilience of Z. capensis to anomalous warming events, and propose transcriptomics as an important tool to predict the tolerance of local populations to thermal stress in the face of global climatic change.
University of Johannesburg
Selection drives phenotypic divergence of Cape leopards
Genetic divergence between populations may occur through selective drivers mediated by environments or spatial niches, which can cause phenotypic variability within species, such as variation in coat colour and body size. The adaptive value of such intraspecific phenotypic variability is still poorly understood. Here, we generated whole genome sequencing data of morphologically distinct Cape leopards (Panthera pardus), which are almost half the size of leopards elsewhere in southern Africa. We investigated their population demographic history and asked whether there are signatures of selection that drive genomic and phenotypic differences. Population structure analyses revealed a clear distinction between leopards from the Cape and northern South Africa, which diverged approximately 24,000 years ago (24 kya), during a cold climatic period with extended drought. Consistent with patterns in other animals, desert biomes typically cause reduced body size due to resource limitations. We found genes under selection that show a habitat-mediated response to environmental stresses, relating to fat rationing, calcium influx, and zinc deficiency. Therefore, the enriched genes we found in Cape leopards may be a response to historical food scarcity in the area, driving basal metabolic rates to reduce body mass. Considering the local adaptation and deep divergence found in both mitochondrial and nuclear genomes, Cape leopards can be considered a unique evolutionary unit. In other species, lineages that diverged 10-30 kya and are associated with specific ecotypes are recognised as different subspecies, underscoring the urgency of sustaining the future viability of this genomically unique umbrella species that is currently at risk of extinction.
Laboratory of Water, Biodiversity and Climate Change
Updates to the checklist of Macroheterocera (Lepidoptera) in the central High Atlas Mountains of Morocco
Given the complexity and relatively lesser-known status of African moths, this study leverages modern genomic techniques to explore and catalog the Macroheterocera diversity in Morocco. The primary objective was to provide an update to the knowledge of nocturnal Lepidoptera species in the High Atlas Mountains using a combination of traditional taxonomy and DNA barcoding, contributing to the broader understanding of biodiversity in this region. Sampling was conducted using sugar bait trapping across nine localities in the High Atlas Mountains. Specimens were identified through an integrative approach combining external morphology, anatomical studies, and DNA barcoding of the mitochondrial COI gene. A total of 4,561 specimens were collected, representing 125 species of Macroheterocera. Anatomical identification, supplemented by external morphology or DNA barcoding, was used for most specimens. DNA barcoding confirmed the identity of 112 specimens, although 9 specimens could only be identified to the genus level and remained taxonomically unresolved. This study highlights the utility of DNA barcoding in complementing traditional morphological methods to enhance species identification and biodiversity assessment. The inability to resolve certain specimens to the species level underscores the need for further genomic and taxonomic studies. Our findings contribute to the foundational knowledge necessary for conservation and biodiversity management in the High Atlas Mountains. Keywords: Lepidoptera, Macroheterocera, High Atlas Mountains, Morocco, sugar bait trapping, DNA barcoding, COI gene, species identification, conservation
Ibn Zohr University, Agadir
Applying genomics to support conservation assessments of Moroccan biodiversity: the Saharan honeybee (Apis mellifera sahariensis) as a case study
In Morocco, there are two well-recognised honey bee subspecies: A. m. intermissa in the north and A. m. sahariensis in the south-east. Honey bee diversity of the latter subspecies is under threat due to anthropogenic factors such as transhumance and the introduction of allochthonous subspecies. To assess the genetic diversity and population structure of the Saharan honey bee, we used a set of 12 microsatellite loci and analyzed 148 colonies, which were grouped into seven populations representing the expected distribution of A. m. intermissa and A. m. sahariensis, together with reference samples from two European subspecies, A. m. carnica and A. m. mellifera. The average number of alleles per locus in the sampled populations ranged from 2.417 (A. m. carnica) to 10 (East Sahara). Per-locus expected heterozygosity (He) ranged from 0.404 ± 0.072 in A. m. carnica to 0.820 ± 0.028 in the High Atlas population. Bayesian clustering analysis with STRUCTURE suggests two distinct clusters in Morocco separated by the High Atlas Mountains (FST = 0.05). Even though high levels of admixture with A. m. intermissa jeopardize the genetic integrity of the Saharan honey bee in the Saharan region, no evidence of introgression was detected from the European reference subspecies. The results of this study reveal that beekeeping practices have strongly influenced the genetic structure and diversity of honey bees from southeastern Morocco, and that introductions of non-native subspecies represent a serious threat to the genetic integrity of native honey bee populations.
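Expected heterozygosity at a locus is the probability that two alleles drawn at random from the population differ, He = 1 - Σ p_i², where p_i is the frequency of allele i. The abstract does not say which estimator was used (for example, whether a small-sample correction was applied), so the sketch below assumes the plain uncorrected form; the allele sizes are invented microsatellite calls.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Return He = 1 - sum(p_i^2) for one locus.

    `alleles` lists every sampled allele copy at the locus (two per diploid
    individual). No small-sample correction is applied; that is an assumption
    of this sketch, not a statement about the study's method.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alleles supplied")
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


# Invented allele sizes (bp) for four diploid individuals at one locus.
locus_alleles = [152, 152, 154, 154, 154, 156, 158, 158]
print(round(expected_heterozygosity(locus_alleles), 5))
```

Averaging this quantity over all loci genotyped in a population gives a per-population value comparable in kind to the He figures reported above.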
Tshwane University of Technology, Pretoria
Recombination hotspot landscape in South African beef cattle
Recombination shapes the trajectory of genetic variability in the genomes of eukaryotes. Despite the plethora of studies investigating the nature of recombination in cattle, knowledge of recombination in South African cattle breeds remains limited. This study aimed to identify patterns of recombination hotspots in two South African beef breeds. Hair samples from a total of 309 Bonsmara (n=190) and Nguni (n=119) cattle were genotyped using the Illumina Bovine 50K SNP BeadChip at the Agricultural Research Council, Biotechnology Platform (ARC-BTP). Haplotype phasing was performed per chromosome across 29 autosomes in both breeds using SHAPEIT v2.r904. Crossover probabilities and recombination events within 0.5 Mb windows were estimated within half-sib families using the DuoHMM package in SHAPEIT. On average, 0.31 and 0.18 recombination events were detected for Bonsmara and Nguni respectively. Recombination hotspots were defined as SNP intervals with recombination rates more than 2.5 standard deviations above the mean. Non-overlapping windows containing recombination hotspots amounted to 407 for Bonsmara and 179 for Nguni. Moreover, Nguni recombination hotspots were evenly distributed across the chromosomes, while hotspots observed in Bonsmara were located towards the start and end regions of the chromosomes. These results provide new insights into the patterns of recombination underlying South African beef breeds. Furthermore, the study indicates that meiotic recombination patterns in African beef breeds are non-uniform and that hotspot regions may harbour agriculturally and economically important genes. The non-uniformity of recombination hotspots observed in Bonsmara and Nguni is consistent with the phenotypic variation that exists between these beef breeds.
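The hotspot definition above amounts to a simple threshold on per-interval recombination rates. A minimal sketch follows, assuming the threshold is the mean rate plus 2.5 standard deviations and that the population standard deviation is used; the abstract does not spell out either detail, and the rates are invented values.

```python
from statistics import mean, pstdev


def find_hotspots(rates, n_sd=2.5):
    """Return the indices of intervals whose recombination rate exceeds
    mean + n_sd * standard deviation of all intervals."""
    threshold = mean(rates) + n_sd * pstdev(rates)
    return [index for index, rate in enumerate(rates) if rate > threshold]


# Invented rates for 20 intervals: a flat background with one elevated interval.
toy_rates = [0.1] * 19 + [2.0]
print(find_hotspots(toy_rates))
```

Because the threshold is derived from each dataset's own mean and spread, hotspot counts for two breeds computed this way reflect how strongly rates deviate within each breed rather than a shared absolute cut-off.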
DIPLOMICS, Cape Town
1KSA - Decoding South Africa’s Biodiversity
South Africa is one of the most biodiverse countries in the world with many institutions researching and documenting local biodiversity. However, South African scientists often conduct genetic research overseas to take advantage of competitive prices internationally. This contributes to a drain of skills, data, knowledge and opportunity out of South Africa. To mitigate this brain drain and to build capacity for designing and carrying out biodiversity genomics experiments in country, a South African Biodiversity genomics program called 1KSA (www.1kSA.org.za) was launched, in 2023, by DIPLOMICS, a Genomics, Proteomics, Metabolomics and Bioinformatics Research Infrastructure program based in South Africa. Following a successful species nomination application, South African sample contributors submit DNA for species of interest to one of several 1KSA partner labs. Whole genome sequencing takes place using Oxford Nanopore Technology PromethION devices and a draft genome assembly is generated using the 1KSA pipeline on the Centre for High Performance Computing. Data are stored using the Data Intensive Research Initiative of South Africa; are made known via the generation of species information cards on the 1KSA website; and access can be requested through the 1KSA Data Access Committee. Sequenced genomes of biodiversity and economically important species are tools for population genomics studies, conservation, management of the impacts of climate change, and identification of novel compounds with spin off potential for development of the bioeconomy. An overview of the 1KSA project will be presented, with short updates from the 1KSA partner labs specialising in sequencing plants, fish, mammals, bacteria, fungi and insects.
Faculty of Sciences, Mohammed V University in Rabat
Structural Genomics of the Bat Endocannabinoid System
The endocannabinoid system is a regulatory network implicated in numerous physiological functions, including synaptic transmission, feeding, sleep, wakefulness, and immunity. While extensive efforts have been made to study the endocannabinoid system in mammals and other animals, bats remain largely unstudied in this context. Analyzing the genetic components of the endocannabinoid system in bats has the potential to unlock mysteries behind their unique physiological traits, such as slowed aging, enhanced metabolism, and flight capability. The analysis of bats' endocannabinoid genes could provide valuable insights into the biological mechanisms underlying these extraordinary abilities, contributing to our broader understanding of physiology and adaptation. This study aims to conduct an in-depth analysis of the endocannabinoid genes in bats—specifically CB1, CB2, FAAH, DAGL, and MAGL—using a comparative genomic approach. The sequences for the CB1, CB2, FAAH, DAGL, and MAGL genes were downloaded from the NCBI databases for Myotis myotis, Desmodus rotundus, Molossus molossus, Rhinolophus ferrumequinum, Pipistrellus kuhlii, Phyllostomus hastatus, Mus musculus, and Homo sapiens. Alignment and phylogenetic analyses were performed using the MEGA software. The TBtools toolkit was used to analyze gene structures and motifs, and to extract introns and exons. Proteins were modeled using homology modeling. The binding affinities for naturally occurring endocannabinoids were then assessed using AutoDock and SwissDock. All the data were compared between bats, humans, and mice.
Institut de Biologia Evolutiva (upf-csic)
The genome of Singekia montserratensis, an apusomonad heterotrophic flagellate with a pivotal evolutionary position
Less than 1% of sequenced eukaryotic genomes are from free-living protists. This is the case for apusomonads, heterotrophic biflagellates that graze on bacteria on surfaces in marine and freshwater environments. They occupy a crucial part of the eukaryotic tree that has not been well sampled: apusomonads are the sister lineage to the opisthokonts, the group comprising multicellular animals and fungi. Moreover, apusomonads are ubiquitous but systematically scarce, hence their ecological importance is unknown. Only a single apusomonad genome is available (from the > 30 distinct genera). To reconstruct ancestral genomic characteristics at distinct moments of eukaryote evolution, from the last common ancestor of eukaryotes to the opisthokonts and their intermediate lineages, we need to cover more diversity. The species of interest, Singekia montserratensis, is a freshwater apusomonad isolated from a brook near the Montserrat mountain. The genome of S. montserratensis was obtained by extracting DNA from three different conditions in a simplified metagenomics context (bacterial food present in culture). The assembly was performed using a bioinformatic pipeline starting with hybrid metaSPAdes, followed by cleaning of the metagenome-assembled genome using a combination of (1) unsupervised methods, Emergent Self-Organizing Map and MetaBAT2, and (2) supervised methods, Tiara and predicted peptides from RNA-seq. From the draft metagenome-assembled genome, proteins were annotated with BRAKER. At different points of the pipeline, BUSCO, QUAST and RNA-seq were used to obtain completeness and contiguity statistics.
Instituto de Ciències del Mar, CSIC
Genomes of uncultured protists: needed, feasible, and useful
Many species of microbial eukaryotes, especially the smallest ones, belong to novel lineages that resist cultivation. Obtaining the genomes of this unknown and uncultured diversity is essential to fully understand eukaryotic evolution and ecological functions. This effort is still in its infancy, and today there are two main approaches: single-cell genomics and metagenomics. I will present data in this direction using freshly collected samples from the Blanes Bay Microbial Observatory in the Northwestern Mediterranean Sea. We first produced a collection of about 300 partial genomes from single cells sorted as pigmented or colorless protists (2-5 µm in size). These corresponded to species mostly within Prymnesiophyceae, Mamiellophyceae, Chlorarachnea, Chrysophyceae, Choanoflagellata and MAST clades. Single-cell genomes contained about one third of the genomic data in each species, and the coverage was improved by coassembling conspecific cells. Second, we sampled for metagenomics the same protist community after it had developed for a few days in a dark incubation. This produced 25 MAGs (Metagenome Assembled Genomes) from similar groups as before and with a considerable recovery (up to 80% of genes predicted). We searched for CAZy genes to evaluate the functional potential of these genomes and to infer possible gene patterns across high-rank taxonomic groups. Moreover, these novel genomes are helping to interpret metagenomic data. Sequencing the genomes of the uncultured improves our knowledge of microbial biodiversity and functions in the ocean and contributes to a better understanding of genome evolution across the eukaryotic tree of life.
Wellcome Sanger Institute
Predictions of protein complexes using protist genomes
Protein complexes are responsible for essential biological functions and thus conserved throughout all domains of life. Our understanding of protein complexes is biased towards model species and genomically well-sampled clades. Although initiatives to diversify the taxonomic spread of available genomes are ongoing, discovery, analysis and prediction of protein complexes throughout eukaryotes is hindered by bias in selection of species for sequencing, which are typically of animal, fungi or plant origin. This bias critically misses the huge phylogenetic diversity present in the 'protists'. To address the underrepresentation of protists and facilitate prediction of protein complexes across all eukaryotic phyla, we predict protein complex profiles (complexomes) for eukaryotes from genomic and/or predicted proteome records, using the EBI Protein Complex Portal database (https://www.ebi.ac.uk/complexportal/home) as a query dataset for homology searches. This long-term project will be key to deciphering the protein complexome of extinct eukaryotic ancestors, particularly the last eukaryotic common ancestor (LECA). As more genomic data becomes available, the predictive capacity of our software will improve. Our database and associated tools will be open and key resources for predictions of protein function in eukaryotic genomes.
Wellcome Sanger Institute
Microsporidian Genomes: tales of sexual reproduction, polyploidy, and rearrangements
Microsporidia are single-celled, spore-forming, obligately intracellular parasites with tremendous public health and economic importance, which infect a huge range of metazoan and protozoan hosts. Since their discovery in the late 1800s as agents of disease in farmed silkworms, microsporidia have been identified as parasites of humans, and as causative agents of beehive collapses. Recently, however, two microsporidian species were shown to be associated with a reduction in Plasmodium transmission in infected Anopheles, suggesting that microsporidia may represent a potential route to malaria control. Despite this, few microsporidian species have had their whole genomes sequenced, and much of the biology and pathogenicity of these parasites remains mysterious. Using the data produced by the Darwin Tree of Life project, I generated 30 high-quality, long-read microsporidian genomes from single hosts (as opposed to pooled infected individuals), including two chromosomal assemblies and the first HiC data ever generated for a microsporidian parasite. I supplement those with another set of genomes generated by collecting 600 insects in the UK, screening them for microsporidian infections, and commissioning infected individuals for long-read and HiC sequencing. In the biggest study of its kind on microsporidian genomics, we show for the first time that microsporidia undergo sexual reproduction, infer the group’s highly contested phylogenetic position and relationship with fungi, reconstruct its ancestral linkage groups, and show that polyploidy and extreme structural rearrangements (both between species, and within the subgenomes of an individual) are widespread in Microsporidia.
Department of Biology, University of Florence
Annotation of eukaryotic genomes in Galaxy
Recent advancements in genome sequencing and assembly promise to produce high-quality reference genomes for many species. However, genome assembly must be followed by accurate genome annotation, and predicting genes in large eukaryotic genomes remains a significant challenge, necessitating the development of new algorithms and streamlined pipelines. To democratize the training and annotation process, we are introducing the first version of the Vertebrate Genomes Project (VGP) annotation workflow in Galaxy, leveraging pre-existing pipelines, particularly the NCBI Eukaryotic Genome Annotation Pipeline (EGAPx) that has been used to annotate VGP genomes so far. Currently, EGAPx uses existing tools such as miniprot and STAR for aligning protein and RNA-seq data, respectively. These alignments are then used by Gnomon for gene prediction, with potential gene models refined through ab initio predictions based on HMMs. The VGP annotation workflow incorporates additional quality control measures throughout the process, including BUSCO analysis of the identified gene models. The newly developed pipeline can handle heterogeneous data, given the significant variation in evidence from transcriptomes and proteomes across taxonomic lineages. The pipeline was implemented in Galaxy to enhance coordination between assembly and annotation processes. This workflow will be used to annotate eukaryotic genomes by integrating various data types and tools to produce high-quality annotations of protein-coding genes and other genomic features, including non-coding regions. The VGP aims to use these methods to establish thorough, accurate reference genomes for about 70,000 vertebrate species.
Wellcome Sanger Institute
Genome After-Party: a repository of ready-made genome analyses
The Tree of Life department of the Wellcome Sanger Institute is largely devoted to generating and analysing high-quality reference genomes for large-scale biodiversity projects. Our flagship project is Darwin Tree of Life (DToL), for which we have generated close to 1,500 reference genome assemblies. All those assemblies are then released to INSDC without embargo and a short paper, called a Genome Note, is published at Wellcome Open Research. Genome Notes require certain genome analyses, such as quality scores (QV, BUSCO) or plots (Hi-C contact maps, BlobTools). These represent the core of the Genome After-Party, a new public data repository (https://gap.cog.sanger.ac.uk/) for common genome analyses, that we are now expanding with sequence-composition tracks, variant-calls, etc. In this talk, I will first present the structure of the Genome After-Party data repository and the pipelines used to generate those data, as well as our future plans. Crucially, the Genome After-Party is public and is built to facilitate reusing genomics analyses. With each genome requiring thousands of CPU hours of compute, the repository will allow saving hundreds of tons of CO2-equivalent, contributing to the sustainability of the Earth Biogenome Project.
Wellcome Sanger Institute
GoaTs, BoaTs, Molluscs & Lepidoptera
The GenomeHubs codebase has allowed us to develop GoaT (Genomes on a Tree) and BoaT (BUSCOs on a Tree) to query and explore genomic data at the scale of the Earth BioGenome Project. We are now extending this approach to our taxon-oriented databases to produce more readily scalable versions of MolluscDB and LepBase. These updated sites provide a platform for exploring genomic features, orthology and ancestral linkage groups backed by data stored on S3 to align with the Genome After Party project.
Department of Anatomy, University of Otago, Dunedin, New Zealand
The NZ Spotty wrasse as a model for sexual fate genomics
The New Zealand Spotty wrasse is found ubiquitously throughout the Aotearoa coastline and possesses the remarkable ability to change sex from female to male (protogynous sequential hermaphroditism). This process is triggered by changes in the social hierarchy; when the dominant male is removed from the social group, the most dominant female in the group will change sex to male. While complete sex change requires approximately 60 days in spotties, we show that disrupting the social hierarchy rapidly initiates (within 1 hour) a substantial change in dominance behaviour in the next-in-line female, and that this is accompanied by increased neural activation in the social decision-making network of the brain. The removal of the dominant fish as a social trigger of sex change is fairly well established; however, the changes in gene expression which initiate sex change remain to be fully elucidated. We present gonad transcriptomic data spanning the full range of sexual and transitional states of spotties, showing that there are specific gene signatures associated with each transitional state, and that masculinising genes increase, while feminising genes simultaneously decrease, throughout the transition. Further, as spotties have both primary and secondary males, we find that there are key gonad expression differences between the two male phenotypes, despite both possessing fully functional male gonads. These remarkable mechanisms highlight the plasticity of the sex phenotype in spotties and together position the spotty as an excellent model for future research of sexual fate determination and its exact genomic triggers.
Wellcome Sanger Institute - University of Cambridge
Sex Chromosome Evolution in Coleoptera
Coleoptera is widely recognised as one of the most diverse and species-rich taxa among insects and animals. Having emerged from a super-radiation in deep evolutionary time, beetles present a unique opportunity to unravel the macroevolutionary processes underlying this remarkable diversity. Surprisingly, despite their immense species richness, relatively few studies have delved into the chromosomal scale of coleopteran genomes. Here, we conducted an analysis of 149 chromosomally complete coleopteran genomes to explore the biology and dynamics of sex chromosome evolution by first reconstructing their ancestral linkage groups and then assessing chromosomal rearrangement events across major lineages of coleopteran taxa. Using 2124 BUSCO genes, we identified eight ancestral linkage groups, previously termed Stevens elements, which differ from the currently published information. Most polyphagan species have XY (or Xyp) sex chromosomes with recurrent independent loss of the Y chromosome across the phylogeny, while the XO system is more common in the adephagan lineage. The identity of these elements is mostly preserved with little inter-chromosomal fusion and fission, except in Curculionoidea and Chrysomeloidea. Nevertheless, the ancestral X chromosome is maintained in almost all beetle species in the dataset, with several independent additions of autosomes to the ancestral sex chromosome in a few species. However, despite the relatively conserved chromosomal identity of the X chromosome, the gene order is mostly shuffled even between species belonging to different genera within the same family, indicating pervasive intra-chromosomal rearrangements in beetles. Our findings thus highlight that, at the molecular level, chromosomal evolution in beetles is mostly driven by intra-chromosomal rearrangements.
The Field Museum, Chicago
Rethinking asexuality: sexual genes in Lepraria lichens
Given its high costs, the ubiquity of sex across eukaryotes strongly suggests it is evolutionarily advantageous. Asexual lineages avoid the risks and energetic costs of recombination but suffer short-term reductions in adaptive potential and long-term genomic damage. Despite these costs, lichenized fungi have frequently evolved asexuality, likely to retain symbiotic algae across generations. The genus Lepraria is thought to be exclusively asexual, while its sister genus Stereocaulon completes a sexual cycle; thus, comparison should shed light on the evolution and long-term maintenance of asexuality. In this study, we assembled and annotated representative long-read genomes from Lepraria and Stereocaulon and added short-read assemblies for an additional 22 individuals across both genera. Comparative genomic analyses revealed that both genera were heterothallic, each with intact mating-type loci of both idiomorphs. Additionally, we assessed 29 meiosis and mitosis genes and 45 genes contributing to formation of sexual reproductive structures (ascomata). All genes were present and appeared functional in nearly all Lepraria, and we failed to identify a general pattern of relaxed selection on these genes across the Lepraria lineage. Together, these results suggest that Lepraria may be capable of sexual reproduction, including mate recognition, meiosis, and production of ascomata. Nevertheless, observations over 200 years have produced no evidence of sexual reproduction in Lepraria. We suggest that instead they may have evolved a form of parasexual reproduction, perhaps by repurposing MAT and meiosis-specific genes, which may avert long-term consequences of asexuality while maintaining the benefit of an unbroken bond with their algal symbionts.
Barcelona Botanical Institute (CSIC-CMCNB), Barcelona
TEs contribute to Anopheles urban adaptation
Anopheles gambiae and An. coluzzii mosquitoes are major human malaria vectors in Africa, accounting for most of the transmission. While urban environments were until recently considered to be unfit for Anopheles larvae development, these mosquito species have rapidly adapted to polluted habitats, posing challenges for malaria control. Therefore, understanding the genetic factors driving this adaptability is crucial. In this work, we have analyzed 375 An. gambiae and An. coluzzii WGS samples from urban and rural areas in six Central African countries. Taking advantage of recent high-quality long-read assemblies for both species, our analysis focused on identifying genetic variants, from SNPs to transposable elements (TEs). We have created the first manually curated TE library for An. gambiae, containing 295 consensus sequences and including 53 new TE families. By combining three TE annotation tools, PoPoolation2, TEMP2 and TEFLoN, we identified 5,462 and 4,773 euchromatic TE insertions present at high frequencies in An. gambiae and An. coluzzii populations, respectively, that could be potentially involved in adaptation. By performing genome-wide selection scans and genome-environment association analyses we identified known and new candidate genes associated with urban adaptation, with half of these genes linked to TEs present at high frequencies. Gene ontology enrichment of these genes revealed functions related to stabilization of membrane potential in An. coluzzii and communication and signaling pathways in An. gambiae. Overall, our work shows that beyond SNPs, TEs could play a significant role in the adaptation of Anopheles species to urban environments.
Senckenberg Research Institute, and Center for Translational Biodiversity Genomics, Frankfurt, Germany
High-quality Tuber genome reveals dynamics of LTR retrotransposons expansion in true truffles
Ascomycetes of the genus Tuber, known as true truffles, are symbiotic filamentous fungi characterized by the formation of hypogeous fruiting bodies and a distinctive flavor, making them food delicacies known for centuries. Due to their economic importance, multiple truffle genomes have been sequenced, revealing peculiar genomic features such as a high proportion of transposable elements (TEs) and low gene redundancy. However, these studies are affected by the high fragmentation of the assemblies and the lack of a solid timeframe for truffle diversification. Here, we address these gaps through a high-quality assembly of Tuber panzhihuanense, along with the first genome-scale fossil-calibrated time tree for Pezizales. With a high contig N50 of 7 Mb, we explored previously hidden genomic features of truffles, such as rDNA cluster organization, the genomic distribution of TEs, and syntenic relationships with their sister clade Morchellaceae. We revealed that Tuber diversification occurred 50 million years ago during the Cenozoic, concurrent with the rise and expansion of angiosperms and the major diversification of mammals. Following the diversification of Tuberaceae, different Gypsy LTR-retrotransposons independently expanded across various truffle lineages, leading to a commonly dominant but highly diverse Gypsy content. We provide evidence that these amplifications may have affected syntenic relationships with Morchellaceae, as well as gene family evolution, potentially increasing gene-family turnover rates in the clade. This study highlights the concurrent expansion of extant truffles within the context of the angiosperm terrestrial revolution and provides novel insights into the tempo and mode of TE evolution in an economically important fungal clade.
Department of Genetics - University of Cambridge
Identifying transposable elements in pangenomes
New high quality genomes, together with faster whole genome alignment methods, have opened the possibility of identifying new transposable element (TE) families by their polymorphic character in different haplotypes, in contrast to previous methods based on repetitiveness, homology and structural features. We have developed a tool, pantera, to obtain transposable element libraries from pangenomes of different haplotypes. In this talk we will briefly explain how pantera works and how we have used it to uncover the large diversity of TEs in the radiation of cichlids of Lake Malawi, where they show different patterns of activity in different populations. We also used pantera in 404 species of Lepidoptera from the Darwin Tree of Life where we found new Maverick families in species where they had not been previously reported, and doubled the number of total Maverick elements so far included in Repbase. We will show how they can be organized in three main monophyletic groups, mainly segregated by the internal organization of the Maverick open reading frames. These findings highlight the fact that the diversity of TEs might have been underrepresented in some cases, depending on the quality of the available genomes, and also by the difficulty of finding new ones when they are low copy number and with low or no homology to previously known families.
University of Copenhagen
Genome size evolution in syngnathiform fishes
While significant variation in genome size has been shown in vertebrates, especially in fishes, the underlying reasons for these differences remain largely unknown. As a first step, characterizing genome size variation across taxa allows us to understand how quickly genomes can change through evolution. Within the order Syngnathiformes, which includes seahorses, pipefishes, and their relatives across ten fish families, the first available genome assemblies suggest significant variation in genome size, but they represent only a fraction of the known diversity. As Syngnathiformes are distributed globally and are diverse in their ecology, life-history and morphology, they present an interesting study system to better understand genome size evolution. By using a dense sampling approach and sequencing short-read based draft genomes for over 250 species of these charismatic fishes, we study genome size evolution in detail for this order. Our dataset reveals enormous differences in genome size, varying by a factor of six. We identify multiple lineages that appear to have expanded or reduced genome sizes. These differences are to a large extent caused by the transposable element (TE) content. Genomes of species differ in abundance of certain TE classes and in their overall TE diversity. Furthermore, we test for correlations between genome size and ecological and life-history traits to gain insight into the potential drivers of these genomic differences. Our findings provide new insights into the evolutionary dynamics of genome size variation in Syngnathiformes and contribute to the broader understanding of genome size evolution in vertebrates.
National Institute of Genetics, Japan
Shark and ray genome sequencing reveals evolutionary trends of vertebrate karyotypic organization
Genomic studies of vertebrate chromosome evolution have long been hindered by the paucity of reliable whole genome sequences of some key taxa. One of those limiting taxa has been the elasmobranchs (sharks and rays), which harbour species often with numerous chromosomes in large genomes. To overcome the limitation, the Squalomix consortium has introduced original laboratory protocols, such as iconHi-C for economical Hi-C and sQuantGenome for cell-free genome size quantification, tested various existing in silico solutions for genome scaffolding and repeat identification, and conducted multifaceted comparative analyses using chromosome-scale DNA sequences. These efforts have allowed the identification of shark sex chromosomes and Hox C genes of sharks and rays that were previously thought to be missing. We also showed various mechanisms of photoreception in sharks that enabled deep-sea lifestyles. Regarding karyotypic evolution, shark genomes exhibit a gradualism of chromosome length with remarkable length-dependent characteristics—shorter chromosomes tend to have higher GC-content, gene density, synonymous substitution rate, and simple tandem repeat content as well as smaller gene length and lower interspersed repeat content. We challenge the traditional binary classification of karyotypes with and without so-called microchromosomes. Even without microchromosomes, the length-dependent characteristics persist widely in some other vertebrate lineages. Our investigation of elasmobranch karyotypes underpins their unique characteristics and provides clues for understanding how vertebrate karyotypes accommodate intragenomic heterogeneity to realize a complex readout.
University of Western Australia
Reconstructing tiger shark history using genomics
Sharks and rays are a clade of high evolutionary, ecological, economic, and cultural significance, and yet they are one of the most threatened taxonomic groups in the marine environment. Despite this, there remains a lack of molecular resources for this class to assist with their conservation. The tiger shark (Galeocerdo cuvier) is a near-threatened keystone species distributed circumglobally that is under substantial pressure from human impacts, making it a high priority for management worldwide. We sequenced and characterised a reference-quality assembly for the tiger shark, the first genome for this family, and used this to dive deeper into the evolution, adaptation, and demographic history of this ancient species. We investigated how its effective population size (Ne), genome-wide heterozygosity and inbreeding have changed over time to infer how this species has responded to past global events, and hence its potential responses to ongoing and future accumulating threats. This work aims to assist in the effective management of this high-profile species.
School of Biological Sciences, University of Auckland
Lifetime fitness is correlated more strongly with structural variant than SNP mutational load in a threatened bird species
Conservation genomics is becoming increasingly interested in whether structural variant (SV) information can help the management of threatened species. The functional consequences of SVs are more complex than for single nucleotide polymorphisms (SNPs) and thus may be more likely to contribute to load. While the impacts of SV-specific genetic load may be less consequential for large populations, the interplay between weakened selection and stochastic processes means that smaller populations, like those of the threatened Aotearoa hihi/New Zealand stitchbird (Notiomystis cincta), may harbour a high SV load. Hihi were once confined to a single remnant population, but have been reestablished into six sanctuaries and reserves, often via secondary bottlenecks, resulting in low genetic diversity, low adaptive potential and inbreeding depression. In this study, we use whole genome resequencing of 30 individuals from the Tiritiri Matangi population to identify the nature and distribution of both SNPs and SVs within this small avian population. We find that individual SNP and SV mutation loads are only moderately correlated, likely because SVs arise in regions of high recombination and reduced evolutionary conservation. Finally, we leverage a long-term monitoring dataset of pedigree and fitness data to assess the impact of SNP and SV mutation load on individual fitness, and demonstrate that SV load correlates more strongly than SNP load with lifetime fitness. The results of this study indicate that examining only SNPs neglects important aspects of intraspecific variation, and that studying SVs has direct implications for linking genetic diversity and genetic health to inform management decisions.
Monash University Malaysia
Reference genomes for the endangered hedgehog (Hippocampus spinosissimus) and three-spotted (Hippocampus trimaculatus) seahorses.
Seahorse populations are declining worldwide and, currently, many species are listed as endangered. Malaysia is fortunate to host 12 out of the ~50 currently known seahorse species. The International Union for Conservation of Nature (IUCN) Red List of Threatened Species classifies seven of the 12 species as being 'data deficient' and the remaining five as 'vulnerable'. Seahorses are also listed as endangered under Appendix II of the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). The high demand for dried seahorses for use in traditional Chinese medicines has resulted in overfishing and exploitation of these iconic animals and an alarming decrease in their populations. We generated high-quality draft reference whole genomes for the seahorses Hippocampus spinosissimus and Hippocampus trimaculatus using PacBio HiFi sequencing on the Sequel IIe. The final genome assemblies for H. spinosissimus and H. trimaculatus were 405 and 385 Mb, respectively, with contig N50s of 464 and 293 kilobases, respectively. The quality of the assembled genomes was assessed using Benchmarking Universal Single-Copy Orthologs (BUSCO), with a prediction of 95% of the known vertebrate genes for H. spinosissimus and 93.1% for H. trimaculatus. The high-quality reference genomes for H. spinosissimus and H. trimaculatus expand the publicly available genomic resources for these species and provide essential foundations for future research in comparative, population and conservation genomics. They will also support the management of seahorse conservation to regulate or stop the trading and marketing of these endangered species in Malaysia and the Association of Southeast Asian Nations.
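Contig N50, the contiguity statistic quoted above, is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal Python sketch follows; the function name and the toy contig lengths are illustrative and are not taken from the seahorse assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the
    assembly size.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy assembly of eight contigs (arbitrary units)
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

In practice the lengths would be read from the assembly FASTA, and tools such as QUAST report the same statistic alongside other contiguity metrics.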
The University of Western Australia
Improving Hifiasm assemblies with 20 kb ONT reads
The quality and quantity of genome assemblies have improved dramatically over recent years. Many large-scale genome projects assemble HiFi and HiC reads using Hifiasm to produce contiguous phased assemblies, scaffolded to chromosome-level. Nevertheless, HiFi reads are typically under 25 kb and can still struggle to assemble long, low-diversity repeat regions. Obtaining 'ultra-long' (100 kb or longer) ONT reads to solve this problem remains a significant challenge due to technical constraints and DNA sample requirements. Here, we explore the utility of using standard ONT long reads (20 kb or more) as 'ultra-long' input to improve phased Hifiasm assemblies for 22 species of bony fish (genome size, 627 Mb - 1.54 Gb). We also explore whether the new 'telo-m' mode in Hifiasm v0.9.0 improves telomere prediction in these species. Incorporating 20+ kb ONT reads (7.8X - 93.5X) significantly increased assembly contiguity. BUSCO completeness was not significantly altered, although there was some re-partitioning of BUSCO genes between phased haplotypes for some species. Improvement did not strongly correlate with read depth (neither HiFi nor ONT), suggesting that the underlying read length distributions and/or specific genome features are more important for determining the outcome. Hifiasm 'telo-m' mode significantly increased telomere recovery, assembling over six times the number of gapless telomere-to-telomere chromosomes when combined with incorporation of ONT reads. Verification of how these results translate to the quality and/or ease of curation of final HiC-scaffolded chromosome-level assemblies is ongoing, with a goal to determine whether the additional sample preparation and sequencing in the lab is cost-effective.
Minderoo OceanOmics Centre at the University of Western Australia
High-Quality Genomes for Australian Lutjanidae Species
Lutjanidae (snappers) are highly valued in commercial and recreational fisheries worldwide, and some species serve as fisheries indicator species, particularly for bioregions in Western Australia. Comprehensive genomic mapping of immune gene families in Lutjanidae species is lacking, but such information can inform our understanding of disease vulnerability and the impact of environmental stress, improve aquaculture efforts, and provide insights into the health of wild populations. Despite their importance, only 3 out of 113 lutjanid species currently have available reference genomes, two of which are highly fragmented (>11,000 and >200,000 contigs), impacting studies on gene families relevant to aquaculture. In this study, we present high-quality chromosome-level reference genomes for 14 Australian lutjanid species across seven genera, generated using PacBio HiFi and Dovetail HiC data. We present initial comparative genomic analyses, including immune gene content and chromosomal synteny analyses across species. These analyses provide insights into the genomic architecture and evolutionary relationships within Lutjanidae. Ongoing work aims to comprehensively map and compare the immune gene family repertoire across genera in Lutjanidae, as well as lethrinid species as an outgroup, to determine genus-specific changes in genes (e.g., loss, selection, duplication) important for pathogen detection, antigen presentation, inflammation, and immune memory. These genome assemblies will serve as a foundational resource to the wider scientific community interested in these species.
Charles Darwin University
Spatial population dynamics of declining mammals in fire-prone landscapes
Populations of native small mammals have been disappearing across Northern Australia. Researchers, conservation managers, and indigenous groups have identified the key interacting drivers of altered fire regimes, grazing by introduced herbivores, and predation by feral cats. Fire management offers the most effective landscape-level opportunity for conservation. However, crucial ecological information about dispersal and population dynamics is needed to strengthen management actions. In this talk, I will present results from 1) a fire experiment to collect genetic data before and after a fire, revealing the responses and recovery processes of small mammal populations and 2) an island-wide landscape genetic study to understand how fire and other environmental variables influence population connectivity. These studies provide necessary ecological information to understand species persistence in fire-prone landscapes to inform conservation decisions. Genomic data can provide conservation managers with detailed insights into population dynamics, building a foundation for robust management decisions.
School of Biological Sciences, Nanyang Technological University, Singapore
Genomics perspectives on the evolutionary ecology of South East Asian rainforest flora
The tropical rain forests in South East Asia (SEA) are among the most diverse habitats in the world, and their ecosystems differ greatly from their counterparts in South America and Africa. This is partly due to their unique biogeography, where the flora has formed as a mixture of Proto-Malesian species of Laurasian origin and Gondwanan flora that migrated to SEA on the Indian plate or through the Sahul/Australia connection. Here we provide a genomics perspective on biodiversity in the SEA tropics. In a collaboration between National Parks Singapore and Nanyang Technological University, we collected and sequenced the genomes of 499 flowering plant species from a local conserved rainforest fragment using shotgun sequencing. Besides providing insights into plant taxonomy, the work sheds light on genomic measures that might work for conservation purposes and suggests the first steps towards a joint genomic analysis of a forest as one ecosystem. We illustrate how this ecosystem genomics approach can be used to identify stratification of different forest successional stages and provide genomic evidence on the drivers of diversification.
Institute of Evolutionary Biology, Spain
Chromosome-level reference genome for the medically important Arabian horned viper (Cerastes gasperettii)
Venomous animals have traditionally been studied from a proteomic (and also transcriptomic) perspective, overlooking the study of venom from a genomic point of view. However, the rise of genomics has increased the number of reference genomes for non-model organisms and made it possible to tackle questions such as venom evolution in a genomic context. Although venomous snakes are the fundamental model system in venom research, the number of high-quality reference genomes remains limited. In this study, we present a high-quality chromosome-level reference genome for the Arabian horned viper (Cerastes gasperettii), a highly venomous snake native to the Arabian Peninsula. Our highly contiguous genome has allowed us to delve into macrochromosomal rearrangements within the Viperidae family, as well as across an elapid and a lizard. Furthermore, we have identified a total of ten different toxins forming the venom’s core, in line with our proteomic results. We have also studied microsyntenic changes in the main clusters of toxin genes and compared them with those of other venomous snake species, highlighting the pivotal role of gene duplication in the emergence and diversification of the two main toxin families for Cerastes gasperettii.
Institute of Environmental Science and Research, New Zealand
Forensic Universal Animal and Plant Identification
We present a successful massively parallel sequencing workflow used to identify the animal and plant species present in highly processed samples. The goal of this research was to provide a service to enable the enforcement of New Zealand’s conservation and anti-wildlife trafficking laws. We have developed a general and flexible approach to the workflow’s methodology and analysis that has allowed us to cater for the variable nature of these sample types. This presentation discusses the research completed to set up the universal method used for these samples. Topics covered will include the development of the laboratory methodology, bioinformatic analysis strategy, and forensic and ethical considerations encountered. Finally, possible future improvements to the method will be considered, with a view to improving outcomes for wildlife conservation. This workflow is a new service from the Forensic Group at ESR and is now offered to New Zealand authorities to aid border crossing investigations and intelligence queries in general forensic casework. We aim to build our capabilities in this area so that this service can play a key role in enforcing wildlife trafficking laws and protecting at-risk species.
University of Western Australia
Genome Evolution in Marine Ray-Finned Fishes
Approximately half of extant vertebrate species are fishes, with more than 30,000 species classified as ray-finned fishes (Actinopterygii). Actinopterygii display diverse phenotypes, feeding strategies and life history traits, and occupy distinct ecological niches, making them an ideal taxon for studying molecular drivers of diversity and adaptation. Despite their diversity and their ecological and economic importance, only 145 Illumina genome assemblies are available for marine Actinopterygii species. In this study we present 250 new marine Actinopterygii genome assemblies generated using Illumina whole genome sequencing and initial results from a large-scale study of these 395 genomes. Using reference-based annotation tools we determine which fish families have unique patterns of gene family frequency / structure (e.g., losses, expansions, contractions), and correlate these with predicted functional signatures to infer biological and ecological adaptations. We identify fish families with distinct rates of change in the gene families present within their genomes (e.g., more losses / expansions or diversity) and associate these patterns with increased rates of diversification or speciation to further elucidate the genomic attributes contributing to ecological success. The results of this work contribute to the growing understanding of fish genome evolution and provide new insights into the evolutionary history and ecological success of marine Actinopterygii.
Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
Genome atomization in clitellates and beyond
The organization of genomes into chromosomes is critical for processes such as genetic recombination, environmental adaptation, and speciation. All animals with bilateral symmetry inherited a genome structure from their last common ancestor that has been highly conserved in some taxa but seemingly unconstrained in others. However, the evolutionary forces driving these differences and the processes by which they emerge have remained largely uncharacterized. Here we analyze genome organization across the phylum Annelida using 23 chromosome-level annelid genomes. We find that while many annelid lineages have maintained the conserved bilaterian genome structure, the Clitellata, a group containing leeches and earthworms, possess completely scrambled genomes. We develop a rearrangement index to quantify the extent of genome structure evolution and show that, compared to the last common ancestor of bilaterians, leeches and earthworms have among the most highly rearranged genomes of any currently sampled species. We further show that bilaterian genomes can be classified into two distinct categories, high and low rearrangement, largely influenced by the presence or absence, respectively, of chromosome fission events. Our findings demonstrate that animal genome structure can be highly variable within a phylum and reveal that genome rearrangement can occur either in a gradual, stepwise fashion or as rapid, all-encompassing changes over short evolutionary timescales.
Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore
Population genomics of critically endangered hawksbill turtles in Singapore unveil genetic vulnerability and urgent conservation needs
Hawksbill turtles are critically endangered, and aiding their recovery would help to protect coral reef ecosystems and their biodiversity. However, genetic studies on marine turtles have relied on limited genetic markers, including the mitochondrial control region, providing incomplete and biased estimates of genetic diversity and population structure. Here, we generated a de novo genome assembly and high-quality whole-genome population datasets from hawksbill turtles nesting and foraging in Singapore. Our analysis greatly increased the number of available genetic markers: based on whole genome comparisons of 35 individuals from Singapore, we uncovered approximately 12 million single nucleotide polymorphisms. With this whole genome dataset, we revealed the extent of genetic diversity and the demographic history of the hawksbill turtles of Singapore. There is a pronounced degree of inbreeding within the population: most individuals share at least a first cousin relationship. Furthermore, we identified a multiple-paternity nest in which the parents are related. The estimated demographic history suggests that past climate change contributed to the decline of the hawksbill turtle population. Positive selection was detected in genes critical for environmental sensing and response, significant to natal behavior, emphasizing the strong correlation between turtle survival and climate and environmental changes. The study shows the significance of whole genome sequencing data for studying this species and the vulnerability of the Singapore hawksbill turtle population to climate change and human activities, including land reclamation plans at their nesting sites. This project aims to direct conservation efforts for the critically endangered hawksbill turtle population in Singapore and in the Indo-Pacific region.
Institute of Oceanography at University of São Paulo
Seascape genomics of the shortfin mako shark (Isurus oxyrinchus) in the Atlantic and Indian Oceans
The shortfin mako shark (Isurus oxyrinchus) is a highly migratory species found in all tropical and temperate oceans. As a top predator, it plays a crucial role in maintaining healthy and rich marine ecosystems and fish stocks. Currently classified as 'Endangered' by the IUCN, it faces significant threats from the fishing industry. To better understand this species and, consequently, allow for more precise and effective conservation efforts, this study aimed to identify population structure and investigate whether oceanographic and geographic factors affect gene flow in the Atlantic and Indian Oceans. A total of 235 tissue samples were collected by Portuguese and Brazilian longline fleets. RNAseq and ddRADseq libraries were prepared, yielding 121 samples with 555 SNPs for RNAseq and 136 samples with 216 SNPs for ddRADseq. Eight outlier SNPs were detected from RNAseq and none from ddRADseq. Population structure was assessed using FST, PCA, DAPC, and admixture analyses, revealing two distinct clusters with low genetic variance, suggesting the existence of two populations: North Atlantic as one and South Atlantic and Indian Ocean as another. Oceanographic variables analyzed included salinity, surface currents, dissolved oxygen, primary productivity, and temperature. Spatial structure was assessed using Cartesian coordinates, shortest by-water path estimates, and dbMEMs. No correlation was found between oceanographic variables and allele frequencies, but a significant correlation was found with dbMEMs, indicating distance as the main factor in genetic differentiation. The finding of two populations is crucial for future conservation efforts and for safeguarding the evolutionary potential of this shark species.
University of Ferrara, Rome
Seahorses in the Anthropocene: conservation gaps & trends
Seahorses (Hippocampus spp.) are flagship animals inspiring numerous conservation programs. They are the first marine genus to be fully listed on the Convention on International Trade in Endangered Species (CITES) Appendix II due to their substantial vulnerability to overexploitation and habitat loss. The peculiar life history of these fishes has been widely addressed through evolutionary and ecological analyses. Yet, no study has integrated current knowledge to assess species-level conservation status, including trends in abundance, diversity, and threats, which hinders the effective worldwide assessment and management of seahorses. Here, we bridge these gaps by taking advantage of the available geographic, ecological and genomic (ultra-conserved elements and whole genomes) data for Hippocampus species at a global scale, and present the most comprehensive study of seahorse conservation status to date. Specifically, we explore seahorse diversity patterns in space and time using species distribution modelling and comparative and conservation genomic applications. We show that, despite seahorses being charismatic animals, their current level of protection is poor, particularly concerning their evolutionary diversity, and we highlight areas and species to be prioritized. Although genomic erosion does not appear to be an immediate concern in these species, most seahorses show a decline in genome-wide diversity and an increased masked genetic load that may reduce their viability in future. Additionally, we observe that their genetic status is not well described by current conservation indicators. These insights provide a broader, more complete picture of the status and trends of seahorses and inform effective conservation initiatives in the face of the Anthropocene.
University of Melbourne
Selective breeding as a method to combat chytrid susceptibility
Serving as a last-resort strategy, assurance colonies provide species with refuge while solutions to their threats are sought. Over the past two decades, captive assurance colonies for amphibians have increased in response to the lack of effective solutions to the deadly amphibian chytrid fungus. With no remedy in sight, our aim was to investigate the use of selective breeding as a strategy to increase chytrid tolerance in captive assurance colonies. We conducted a disease challenge using 1,000 juvenile southern corroboree frogs (Pseudophryne corroboree) from two captive assurance colonies. Individual survival and infection loads were measured over the duration of the experiment, with all frogs subsequently genotyped using a custom 50K SNP array. Overall, 30% of the frogs survived, with significant differences observed between individuals from the two captive colonies. Genetic diversity and relatedness varied between the two captive colonies, with the more genetically diverse colony demonstrating better performance in the disease challenge. Using the phenotypic and genetic data, we conducted a genome-wide association study on seven chytrid-associated traits, revealing their polygenic nature. No genes of major effect were identified within this population. Consequently, we calculated heritability estimates for these traits, and breeding values for each individual, to model various selective breeding strategies. Our findings suggest that selective breeding strategies could increase tolerance to chytridiomycosis by 6-15% per generation. This work demonstrates that implementing a selective breeding strategy can result in tangible improvements to the conservation outcomes for a critically endangered amphibian.
Queen’s University, Kingston, Ontario
Chromosome-level genome assemblies shed light on anuran genome evolution
Reference genomes have been particularly scarce for amphibians, mostly due to their large genome sizes and high repetitiveness. We assembled a chromosome-level reference genome for the Western Chorus Frog (Pseudacris triseriata) using PacBio Revio HiFi long reads and Dovetail OmniC data and annotated it using RNAseq data from 6 tissue samples and OrthoDB vertebrate protein data. The Western Chorus Frog is a temperate North American tree frog species in the subfamily Acrisinae. Ours is the first annotated genome for this clade and will underpin future evolution and conservation research. Using this new genome along with an additional 28 chromosome-level frog genomes, we performed comparative genomic analyses and confirmed the extremely conserved macrosynteny that has been found in other anurans. We constructed ancestral karyotypes, examined chromosome rearrangement rates in different clades, and found ‘rebel’ genes that were not syntenic across all lineages and that might be evolutionarily significant. Anurans are known to have genetically determined sex, but most species have undifferentiated sex chromosomes, and this has impeded our understanding of the evolution of sex in this group. We identified three candidate sex chromosomes for anurans based on the presence of dmrt-family genes and chromosome synteny. One candidate sex chromosome is supported by our ddRAD sequencing data from the Western Chorus Frog, which revealed an XY system and male-biased markers mapped to chromosome 1. Overall, our study provides an important genome resource to the community and critical insights into anuran autosomal and sex chromosome evolution.
Oxford Brookes University
Novel Endogenous Retroviruses in Anuran Genomes
Endogenous retroviruses (ERVs) represent remnants of past retroviral infections and have had a pivotal role in vertebrate evolution, contributing novel genes and regulatory elements. Although ERV research has predominantly focused on mammals, the increasing availability of genomic data allows exploration of other vertebrate groups. Because the Anuran order of amphibians has a similar number of extant species and comparable genome sizes to mammals, we are interested in ERV presence and distribution in this group. We screened 47 publicly available Anuran genomes for the alpharetrovirus genus of ERVs. Although alpharetroviruses were previously thought to be avian-specific, four recombinant alpha-like ERV families have recently been identified in two amphibian and two reptile species. Through our screening we identified ten novel families of Anuran ERVs (AnERV1-10) with intact viral genes in 11 species; three of these families are present in different species, with the remainder being unique to the genome in which they were found. Multiple copies of solo LTRs, truncated and full-length elements were found in these genomes for each ERV family. Based on phylogenetic and protein domain analyses, all ten novel ERV families appeared recombined, with two families featuring alpharetrovirus-classified Env proteins, while most had epsilonretrovirus-classified Pol and gammaretrovirus-classified Env proteins. These findings extend the pattern of recombination observed previously in Anurans and add to the complex evolutionary history of retroviruses. By determining the presence and state of ERVs in Anurans, we can better understand ERV evolution within host genomes and their role across the vertebrate tree of life.
The New Zealand Institute for Plant and Food Research Limited
Recovery of metagenomically assembled genomes (MAGs) from plant-associated microbial communities
All organisms are meta-organisms. The complex interactions between microorganisms and their hosts can significantly influence host fitness, stress responses, and physiology, highlighting their essential role within all organisms. With current advances in sequencing technologies, shotgun metagenomics provides the possibility of gaining comprehensive insight into microbial diversity and functional properties. Furthermore, shotgun metagenomics enables reconstruction of metagenome-assembled genomes (MAGs), which are vital in exploring microbes’ metabolic capabilities and interactions with hosts. However, recovery of MAGs can be hindered by host contamination, low-abundance species and the complexity of the community structure. In this study, we used multiple tissues from mānuka, a native shrub in New Zealand and Australia, to explore the best approaches to maximise the recovery of MAGs from plant-associated microbial communities. We used multiple approaches for assembly and binning to optimise the recovery of MAGs from both prokaryotic and eukaryotic microbes. We will discuss the critical challenges faced, potential workflows, and the key insights gained from our work on MAG recovery in complex host-associated data sets.
AdvanSentinel Inc.
QuickConc: Fast, Efficient, Power-Free eDNA Capture
Environmental DNA (eDNA) analysis is crucial for non-invasive biodiversity monitoring, revealing species distribution and abundance without ecosystem disruption. Traditional eDNA concentration methods, such as filtration using peristaltic pumps, are labor-intensive and require transporting sampled water. Advancements like Sterivex cartridges and electric water samplers aim to simplify processing but still struggle with efficiency, especially in highly turbid waters. Maximizing eDNA extraction is vital for accurate monitoring, relying on efficient DNA capture, extraction, and preservation. Despite improvements in extraction and preservation, enhancing DNA capture efficiency has been largely overlooked. This research introduces QuickConc, a novel nucleic acid capture method using cationic substances that enhance the interaction between silica and eDNA. QuickConc incorporates dispersible glass fiber sheets, significantly increasing binding efficiency, boosting sensitivity, and facilitating a rapid, power-free on-site filtration process with high DNA yields. Comparative analyses using river, sea, and pond water demonstrated that QuickConc significantly outperformed traditional glass fiber filter and Sterivex methods for extracting eDNA. It also yielded higher amounts of eDNA for specific fish species. Metabarcoding analyses using the MiFish primers revealed that the number of fish species detected in river water was higher with QuickConc than with other methods, while in seawater, the number of fish species was similar. QuickConc represents a significant advancement in eDNA analysis, offering a more reliable, efficient, and field-applicable approach to biodiversity monitoring and conservation strategies.
Ankara University, Türkiye
Network analysis of plant preferences of honey bees via metabarcoding
Nutrition is an important part of the well-being of honey bees, and studies are ongoing on how they choose nectar and pollen sources. DNA from honey bees can also be metabarcoded to study communities, but little attention has been paid to interactions between plant species. Beyond the early detection of unwanted species, high-throughput tools are increasingly used to speed up applications relating to hive health and traceability. Metabarcoding of the intestinal contents of individual bees from a hive can yield a wealth of information, such as plant species diversity, flight distances, foraging preferences and health. In this study, the intestines of 7 individuals taken from hives subjected to different conditions were used. To obtain plant DNA, DNA was extracted from bee intestines and PCR amplifications were conducted with primers targeting the plant ITS2 region. PCR products were indexed using an adaptor ligation procedure and sequenced on an Illumina NovaSeq 6000 (2x250 bp paired-end, 100K reads per sample). The sequencing data were analysed with several pipelines run from the Linux/Unix command line. In this way, we evaluated how successfully environmental DNA and metabarcoding-based methods can identify, at the species level, sample contents that would be difficult to characterise using standard methods. In the results obtained, matches above 97% against the NCBI database were accepted.
Flanders Research Institute for Agriculture, Fisheries and Food, Belgium
Development of a PCR-free strategy for DNA-based characterization of metazoan bulk samples for environmental monitoring
When using DNA metabarcoding on communities with high phylogenetic diversity, it can be difficult to tailor PCR primers that effectively amplify a marker gene from all species. This results in PCR amplification bias, which affects the quality of monitoring data derived from metabarcoding. PCR-free approaches (i.e. shotgun metagenomics) can circumvent this issue, but there are two major hurdles preventing the wide applicability of this approach: 1) a lack of reference genomes for most species present in a given environment, and 2) computationally intensive pipelines for processing shotgun metagenomic data. We propose a strategy that tackles these two hurdles and apply it to classify shotgun metagenomic reads from macrobenthos samples. We selected 25 macrobenthos species from various phyla for low-coverage Illumina whole genome sequencing. We build a k-mer index database directly from the sequencing reads, thus circumventing tedious genome assembly, that can be used to classify shotgun metagenomic reads using a very fast exact k-mer matching algorithm. We show that low-coverage genome sequencing allows us to build a database that equals the classification potential of a database built with fully assembled reference genomes. We are able to classify a large fraction of metagenomic reads from our samples (up to 96%). Results from shotgun metagenomics align better with biomass than those from metabarcoding due to the absence of PCR amplification bias. Our strategy provides an easy, fast, and accessible way to assess community composition in metazoan bulk samples by shotgun metagenomics.
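The idea of indexing k-mers straight from unassembled reads and classifying by exact matching can be illustrated with a small toy sketch. This is not the authors' pipeline or any particular classifier; the function names, the species labels, the tiny k value and the majority-vote rule are all assumptions made for illustration, and reverse complements are ignored for brevity.

```python
from collections import Counter

K = 5  # toy k-mer size; real tools use much larger k


def kmers(seq, k=K):
    """Return the set of k-mers in a sequence (forward strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(reads_by_species, k=K):
    """Map each k-mer to the set of species whose reads contain it."""
    index = {}
    for species, reads in reads_by_species.items():
        for read in reads:
            for kmer in kmers(read, k):
                index.setdefault(kmer, set()).add(species)
    return index


def classify(read, index, k=K):
    """Assign a read to the species with the most unique exact k-mer hits."""
    votes = Counter()
    for kmer in kmers(read, k):
        owners = index.get(kmer)
        # Only k-mers seen in exactly one species cast a vote.
        if owners and len(owners) == 1:
            votes[next(iter(owners))] += 1
    if not votes:
        return None  # unclassified
    return votes.most_common(1)[0][0]


# Hypothetical low-coverage reads for two made-up reference species.
reference_reads = {
    "species_a": ["ACGTACGTTAGC"],
    "species_b": ["TTGGCCAATTGG"],
}
index = build_index(reference_reads)

for query in ["CGTACGTTA", "GGCCAATT", "AAAAAAAA"]:
    print(query, classify(query, index))
```

Because the index is a plain hash lookup, no assembly or alignment step is needed, which is the property the abstract relies on for speed.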
North Carolina State University
Know your host: pathogen detection in filth flies (Diptera: Muscidae)
The cost of and demand for protein are increasing due to limited resources, prompting the exploration of alternative proteins. Wild-caught insects, such as filth flies, can be considered a supplement to livestock diet as they are highly abundant, easy to collect, and contain high amounts of protein. However, their potential to transmit pathogens raises concerns. Before incorporating these insects into livestock diets, it is crucial to identify any pathogens present, even if disinfection will follow. In this study, we compared the microbiota of flies collected from two different livestock farms to identify potential pathogens. We utilized a combination of shotgun metagenomic sequencing and RNA-seq to detect bacteria, viruses, fungi, and eukaryotic parasites in these filth flies. Taxa were identified using several curated databases with Kraken2, including those from RefSeq, PR2, and EuPathDB. Using the default databases initially led to false-positive matches of parasites. However, incorporating taxa known to be associated with filth flies improved accuracy and eliminated false hits. This study underscores the importance of including relevant taxa in current human or veterinary-focused databases to account for the natural parasites of insect hosts.
School of Biomedical Sciences, University of Otago, New Zealand
CRISPR and artificial intelligence for environmental biosecurity
Clustered regularly interspaced short palindromic repeats (CRISPR) and associated proteins (CRISPR-Cas) can be programmed through RNA molecules known as CRISPR RNAs (crRNAs) to target specific gene regions. CRISPR-Cas has, therefore, been presented as a novel diagnostic tool known as CRISPR-Dx when used with isothermal pre-amplification (e.g., RPA, recombinase polymerase amplification). CRISPR-Dx holds great potential for environmental biomonitoring, due to the increased speed, specificity, and sensitivity of assays compared to PCR approaches. These features of CRISPR-Dx are heavily dependent on the quality of the crRNA construct. In this study, we employ an artificial intelligence approach known as ADAPT (Activity-informed Design with All-inclusive Patrolling of Targets) to design highly specific and sensitive assays for important New Zealand invasive species for eDNA biomonitoring purposes, including Sabella spallanzanii (Mediterranean fanworm) and Undaria pinnatifida (wakame). Initial screening results in synthetic DNA have showcased a streamlined crRNA design pipeline with promising specificity and sensitivity, with on-target activity ranging from 100 pM to 10 aM (5 copies/µL) in less than 30 minutes for both species. Similar results were obtained in tissue genomic DNA, with sensitivity down to 18 fg/µL and 34 fg/µL, respectively. Thus, our assays showcase an opportunity for eDNA detection in biosecurity using publicly available genomic data. Cas13-based approaches could also be used for conservation efforts (e.g., endangered species). Hence, ADAPT empowers Cas13a-based approaches, streamlining the capability of detecting virtually any organism for which reliable genomic information is available.
United States Geological Survey
The genomics of reproductive isolation in ecologically divergent lineages of scincid lizards in California
The capacity to sequence whole genomes for non-model organisms is providing greater opportunities to understand the links between ecological divergence and the evolution of reproductive isolation, an association often confounded by gene flow between populations as they begin adapting to different environments. We will be using whole genome data generated from the California Conservation Genomics Project to identify the genetic underpinnings of ecologically divergent traits that are known to influence pre-mating reproductive isolation in scincid lizards of the Plestiodon skiltonianus species complex. Existing genomic data indicate that introgression has contributed to the process in unexpected ways, with effects likely extending beyond pre-mating barriers to include post-mating barriers as well. One potential post-mating barrier involves decoy tail coloration in juveniles, which is either bright red or blue depending on geography and draws predator attention toward the expendable tail rather than the lizard’s body. While distinctive reds or blues are predominant, some populations have intermediate purplish tails that may reflect gene flow between red and blue populations across clade contact zones – whether color intermediacy increases juvenile vulnerability to predation is unknown, but if so, this could be a powerful reinforcement mechanism that maintains the genetic integrity of red and blue populations. Our goal is to use the whole genome data to identify the genetic basis and associated regulatory pathways of these colors to better understand how different selection pressures promote lineage divergence in different ecological settings.
UCLA, CCGP
The California Conservation Genomics Project
As we enter the fifth and final year of the CCGP, we are starting to more fully appreciate the power of our dataset to influence conservation at a large spatial scale. In this session, we present synthetic, data-driven analyses of our current multi-species project as well as a handful of case studies of individual taxa. We have also left time at the end of the session to discuss how the CCGP can address questions that we haven’t yet considered, and how it can serve as a model for other landscapes represented in the EBP.
UC Berkeley
Historical and contemporary drivers of genetic diversity across California
Conserving genetic diversity is crucial for maintaining population viability and resilience to change. Understanding what drives spatial patterns in genetic diversity is a fundamental question in both population and conservation genetics. Historical and contemporary landscapes may play an important role in shaping genetic diversity patterns by influencing how population size and gene flow vary across space and time. To understand how landscape features influence genetic variation in western fence lizards (Sceloporus occidentalis), we used whole genomes sequenced by the California Conservation Genomics Project for 163 individuals covering the species' geographic and environmental range across California. We used paleoclimate data to determine how historical climate shapes genome-wide genetic diversity. We used contemporary climate and land use data to identify drivers of recent inbreeding and patterns of gene flow. Our results showed that historical climate change was a very strong predictor of genetic diversity across California, potentially due to range expansion following glacial retreat. We found that confounding correlations between historical and contemporary landscapes make it challenging to disentangle contemporary drivers of genetic diversity. These results highlight the important role that historical landscapes have in shaping patterns of contemporary genomic diversity. Both historical and contemporary landscape variables should be considered in order to best conserve genetic diversity for the future.
United States Forest Service
Genome assemblies of at-risk California bumble bees
Many bumble bee species are declining in abundance world-wide due to diverse factors, including habitat loss, pesticide exposure, disease, and climate change. However, some species’ populations appear to be stable or even increasing, with reasons for this disparity unclear. Increased genomic resources for different bumble bee species will aid in determining any genetic bases for these differences, which might be reflected in environmental signatures of selection, as well as help identify conservation units. We assembled and compared highly contiguous and complete de novo genomes of one geographically restricted species, the Sonoran bumble bee (Bombus sonorus Say 1837) and the most widespread and abundant bumble bee in California, the yellow-faced bumble bee (Bombus vosnesenskii Thorp 1983). Genomes were generated from wild-caught individuals, using Pacific Biosciences HiFi long reads and Omni-C data. The final genome for B. sonorus is 0.37 Gb across 282 scaffolds, with a scaffold N50 of 16.2 Mb and a BUSCO score of 97.6%. The genome for B. vosnesenskii is 0.46 Gb across 433 scaffolds, with a scaffold N50 of 12.5 Mb and a BUSCO score of 97.5%. We will compare these assembled genomes with other closely related published Bombus genomes to infer chromosome assignment and visualize genomic variation between species. By using these assemblies as the reference genomes for future population genetic studies of B. sonorus and B. vosnesenskii, we hope to improve conservation planning for both species by assessing geographic patterns of dispersal, relatedness, and genetic structure.
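For readers less familiar with the contiguity statistic quoted above, the following is a minimal sketch of how a scaffold N50 is conventionally computed from a list of scaffold lengths. The function name and the toy lengths are hypothetical and are not taken from the B. sonorus or B. vosnesenskii assemblies.

```python
def n50(lengths):
    """Return the N50 of a set of scaffold lengths.

    The N50 is the length L such that scaffolds of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (arbitrary units), for illustration only.
toy_lengths = [50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 40
```

In the toy case the total is 150, and the two longest scaffolds (50 + 40 = 90) are the first to cover half of it, so the N50 is 40.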
University of California Santa Cruz
Regional community conservation genomics highlight factors affecting population health
Many of the world’s wildlife populations are declining at an alarming rate, precipitating an unprecedented biodiversity crisis. Reductions in population size will eventually lead to isolation and loss of genetic diversity, which can reinforce population decline through mutational meltdown and eventual extinction. An outstanding challenge of conservation genetics research is to develop metrics that reliably predict population health and the factors affecting population change. Measures of genetic diversity often reflect demographic and evolutionary processes during recent glacial cycles, which makes the use of genetic diversity in conservation monitoring an outstanding challenge. One promising avenue is to establish baselines for regional taxa that have experienced recent shared environmental pressures. Here, we leverage a large regional population genomic dataset — representing 221 species across 74 orders of California wildlife and nearly 20,000 individuals — to evaluate the factors that lead to mutation load at the regional scale. We show that estimators of inbreeding, calculated from long runs of homozygosity, are better predictors of recent population changes than genetic diversity metrics, such as nucleotide diversity, that instead reflect long-term dynamics of population change. We summarize which genomic indicators of population health may be useful for informing conservation action and directing management decisions for an entire community of California wildlife.
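As an illustration of the kind of inbreeding estimator mentioned above, the sketch below computes a commonly used quantity, F_ROH: the fraction of the genome covered by runs of homozygosity (ROH) above a length threshold. This is an assumed, generic formulation and not necessarily the estimator used in this study; the function name, the 1 Mb threshold, and the toy segments are hypothetical.

```python
def f_roh(roh_segments, genome_length, min_length=1_000_000):
    """Fraction of the genome in ROH of at least min_length base pairs.

    roh_segments: iterable of (start, end) half-open coordinates in bp,
                  assumed to be non-overlapping.
    genome_length: total callable genome length in bp.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = sum(
        end - start
        for start, end in roh_segments
        if end - start >= min_length  # keep only long runs
    )
    return covered / genome_length


# Hypothetical individual: two long runs and one short run that is ignored.
segments = [(0, 2_000_000), (5_000_000, 5_500_000), (10_000_000, 13_000_000)]
print(f_roh(segments, genome_length=100_000_000))  # 0.05
```

Restricting the sum to long runs is what ties the statistic to recent inbreeding: long homozygous tracts have had little time to be broken up by recombination.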
University of California Berkeley
Patterns of statewide gene flow and population structure impact the adaptive potential of wild populations to environmental change
The ability of populations to adapt to environmental change may determine which species persist or face extinction under human-induced global warming. Both adaptive variation and gene flow can contribute to the potential pace of a species’ response to changing environmental conditions, and a consideration of both evolutionary forces is therefore critical to understanding the evolutionary potential of populations. As part of the California Conservation Genomics Project, we use whole genome sequence data to understand patterns of adaptive variation, gene flow, and population structure boundaries across the state of California. We highlight populations and taxa most in need of urgent conservation attention – those with limited gene flow from populations with climate-adapted alleles – and investigate corridors of connectivity and population structure breaks shared among multiple taxa. We provide a population genomics perspective on the potential community-wide responses of species to future environmental change, demonstrating that both adaptive potential and gene flow will have outsized impacts on the fate of wild populations.
UC Santa Cruz
A Fast, Reproducible, High-throughput Variant Calling Workflow for Population Genomics
The increasing availability of genomic resequencing data sets and high-quality reference genomes across the tree of life presents exciting opportunities for comparative population genomic studies. However, substantial challenges prevent the simple reuse of data across different studies and species, arising from variability in variant calling pipelines, data quality, and the need for computationally intensive reanalysis. Here, we present snpArcher, a flexible and highly efficient workflow designed for the analysis of genomic resequencing data in nonmodel organisms. snpArcher provides a standardized variant calling pipeline and includes modules for variant quality control, data visualization, variant filtering, and other downstream analyses. Implemented in Snakemake, snpArcher is user-friendly, reproducible, and designed to be compatible with high-performance computing clusters and cloud environments. To demonstrate the flexibility of this pipeline, we applied snpArcher to 26 public resequencing data sets from nonmammalian vertebrates. These variant data sets are hosted publicly to enable future comparative population genomic analyses. With its extensibility and the availability of public data sets, snpArcher will contribute to a broader understanding of genetic variation across species by facilitating the rapid use and reuse of large genomic data sets.
Sonoma State University
Genomic variation in a cold adapted leaf beetle
The leaf beetle Chrysomela aeneicollis inhabits cool habitats in montane and coastal regions of California. Prior studies suggest that populations in the Sierra Nevada mountains show evidence of local adaptation at metabolic enzyme loci and at the mitochondrion. Beetles from the Eastern Sierra Nevada show evidence of mitonuclear epistasis with respect to performance, metabolic rate, and components of fitness. Southern Sierra populations (Big Pine Creek, Taboose Creek and South Tuttle Creek) differ from central Sierra Nevada populations (e.g. Rock Creek) at genes coding for mitochondrial tRNA and 16S ribosomal RNA. Across California, coastal populations show greater genomic divergence from montane populations than montane populations from Northern and Central California show from each other. Multiple candidate loci have been identified that are associated with differences between populations experiencing cold, snowy winters and those that occur in the milder climate prevailing along the California coast. In this presentation, we will compare differentiation at the mitochondrion to that observed for the nuclear genome between multiple coastal and montane beetle populations. Other studies predict that mitochondrial differentiation should be greater than differentiation among populations at nuclear genes, a prediction that we test in the current study.
Johns Hopkins University
EBP Policy on Digital Sequence Information (DSI)
Digital Sequence Information (DSI) of the Earth’s biological diversity is fundamental to advances in science, society, and the well-being of humankind. EBP advocates and promotes seven essential principles stated as of 13 January 2022 following the recommendations of the Global Biodiversity Framework of the UN CBD COP15 to assist in fair, universal, and practical protocols for the access and benefit sharing of DSI. These principles will be reviewed and revised, if necessary, according to the discussions and decisions at CBD COP16 in Colombia in October 2024, in which EBP representatives will participate.
University of Waikato
Respecting Indigenous Rights and Interests in EBP ELSI’s Work
Critical to EBP ELSI’s work, and to biodiversity research generally, is recognition of and respect for the related rights of Indigenous peoples and local communities with sovereignty claims and cultural and intellectual rights over associated lands, plants, and animals. This talk will explore the ways in which EBP ELSI has reinforced these rights in the development of policy and guidance documents for EBP.
Cambridge University
EBP Data Sharing Policy
EBP ELSI is currently preparing the EBP Data Sharing policy. Proposals under consideration are: we aim for all EBP reference genomes to be deposited in INSDC; the location of sampling and the sample collector should be given in the associated BioSample object; any IPLC / TK assertions of rights should be attached (potentially via LocalContexts labels); if there are restrictions on downstream use, e.g. arising from ABS agreements, then this should be stated and contact information provided for follow-up; we encourage early data release - if an embargo period is necessary then this should be for a maximum of one year. This talk will provide an update on the status of this policy.
Arizona State University
EBP Intellectual Property Policy
One goal of the EBP is to establish reference DNA genomic sequences of eukaryotic organisms. As a global public good, the reference sequence information should be freely available for use anywhere for any purpose. This implies that reference sequences be free of legal encumbrances, including patents or restrictive use licenses. EBP ELSI is currently preparing the EBP Intellectual Property (IP) policy. This talk will provide an update on the status of this policy.
Wise Ancestors
Montañerito Paisa assembly & Wise Ancestors model
The Antioquia Brush-finch (Atlapetes blancae) – known in Spanish as 'Montañerito Paisa' – is a small passerine bird rediscovered in the wild in 2018 in the Colombian Central Andes. The Montañerito Paisa is currently classified as critically endangered, with the main threats faced by the species being the expansion of agricultural activities and deforestation. Since 2019, the Montañerito Paisa Conservation Initiative has established the foundations of community-based conservation initiatives around the species, successfully instilling a sense of pride and responsibility in the community. Wise Ancestors, a non-profit organization affiliated with the Earth Biogenome Project, operates by braiding biotechnology, Indigenous science, and Traditional Ecological Knowledge to conserve biodiversity. Its model relies on a decentralized and collaborative system, where projects are co-developed with local scientists and/or Indigenous Peoples and Local Communities (IPLCs) before being posted to an online platform. There, collaborators who can perform the technical work (labs, sequencing facilities, biobanks, etc) are recruited and work together to complete Conservation Challenges. Local Colombian scientists, community members and Wise Ancestors are partnering to generate a highly contiguous genome and annotation for the Montañerito Paisa and a population genomic study to estimate genome-wide effective population size, genetic diversity and assess prospects of genetic rescue for future management plans. Non-lethal samples from several individuals will also be collected to be biobanked for future studies. During this shared presentation, we will introduce both the Montañerito Paisa project and the Wise Ancestors model, and you will learn how you can take part in the journey.
Sorbonne Law School, Université Paris 1 Panthéon-Sorbonne & Faculty of Law, Université de Montréal
Digital sequence information and indigenous data sovereignty: COP15 decision 15/9 as a revelator of issues at stake in the emerging international law of open science
The Convention on Biological Diversity (CBD, 1992) and its additional Nagoya Protocol (2010) make access to genetic resources conditional on the prior consent of the States or Indigenous Peoples providing these resources, and on the fair and equitable sharing of the benefits arising from their utilization. However, with the development of high-throughput sequencing and the boom in genomics, genetic resources are being massively digitized and turned into freely accessible big data, on the margins of the Nagoya Protocol's regulatory pretensions. COP15 adopted an important decision (15/9) that seeks to find a solution to these controversies, which schematically pit supporters of free access to digitized genetic resources (Northern States, scientists, companies) against opponents (Southern States, Indigenous Peoples). In a delicate exercise of diplomacy, the decision opts for the open access option, while leaving room for competing claims. Dealing with such diverse rights and claims - or, to put it another way, articulating FAIR and CARE - is no easy task. After recalling the origins, terms and stakes of the controversy over digitized genetic resources, I will first show how it reveals the scientific and digital divides between the North and the Global South in terms of access to, use of and benefits from data and digital technologies. Secondly, I will demonstrate how decision 15/9 is an important step in the battle over data sharing and, more broadly, in the construction of an emerging international law of open science, whose issues of distributive justice and equity leave much to be desired.
University of Waikato
Indigenous Data Sovereignty in Biodiversity Genomics
The interface of Indigenous rights, intellectual property, and benefit sharing has seen significant advancements, particularly in the context of the United Nations Convention on Biological Diversity (CBD) and digital sequence information (DSI), in which new provisions around DSI are currently being negotiated. Central to these discussions has been ensuring that Indigenous peoples and local communities (IPLC) receive fair and equitable benefits from its use, in a manner that enables the ongoing protection and sustainable use of biodiversity. Also of key importance to IPLC is the recognition of their provenance metadata, and ensuring that Indigenous data sovereignty is upheld and enshrined within regulation. In this presentation, KatieLee will share her experiences as part of the IIFB within these discussions, and share recent developments, as well as pathways to create effective change in the absence of clear regulation.
Penn State University
Comparative analysis of adaptive immune systems in ruminant species
A central challenge faced by all organisms is defending themselves against pathogens, including those that are often rapidly evolving. Early in the lineage leading to jawed vertebrates, evolution devised an ingenious solution in the adaptive immune system – in which sets of germline immunoglobulin (IG) and T-cell receptor (TR) genes, collectively called the IG and TR loci, undergo a process called V(D)J recombination that generates an immensely diverse collection of antibodies and T-cell receptors with a potential to recognize a huge variety of pathogens. We have a remarkably limited understanding of what the IG/TR loci, and the resulting receptors look like for essentially all non-model species (including agriculturally important species such as cattle and sheep) as these loci are among the parts of the genome left on the cutting-room floor when reference genomes are released. This is because until very recently, the IG/TR loci had been nearly impossible to assemble as the structural complexity of the regions thwarted standard assemblers designed for short-read sequences. It is only with the advent of long-read sequencing platforms and specialized assembly algorithms over the last several years that researchers were able to reliably conduct population level sampling and variant curation. Recent studies pioneered techniques for estimating the IG/TR gene content from existing genome assemblies and produced the first estimate of the number of V, D, and J genes in a phylogenetically diverse set of mammals. While these methods are a substantial advance, many challenges related to detection of highly diverged IG/TR genes, IG/TR gene verification, IG/TR gene naming, and computational analysis remain practically unaddressed. In this talk, I will present results on the comparative analysis of adaptive immune loci of agricultural species and discuss new immunogenomics challenges.
i3S – Institute for Research and Innovation in Health, University of Porto, Portugal
Dissecting how dividing cells adapt to karyotypic evolution using Asian muntjac deer as model systems
The Indian muntjac (Muntiacus muntjak) is the mammal with the lowest known chromosome number, comprising only 2n=6 giant chromosomes in females, thus creating the opportunity to study mitosis in an unprecedented way, as fewer chromosomes and bigger kinetochores make it easier to visualize unknown mechanisms behind the cell division process. The unique cytological features of this unconventional cellular system have allowed us to understand that fundamental steps of mitosis such as chromosome congression, bi-orientation and segregation are biased by kinetochore size (Drpic et al., 2018). Moreover, this system unveiled a key role for Augmin in kinetochore microtubule self-organization and maturation, independently of centrosomes (Almeida et al., 2022). We are now taking advantage of a closely related muntjac species, the Chinese muntjac (Muntiacus reevesi), with similar genome size and composition but distinctively divergent chromosome number and size (2n=46 small chromosomes), to understand how the mitotic machinery adapted to alterations in chromosome number and size. Importantly, we are currently investigating whether chromosome number alterations per se render cells more or less dependent on specific mitotic genes that are being therapeutically explored as targets of selective anti-cancer drugs. In a comparative evolutionary approach, we aim to use telomere-to-telomere (T2T) sequencing information of both muntjacs (in collaboration with the RT2T consortium) to precisely identify the fusion regions in the Indian muntjac giant chromosomes and, ultimately, understand the chromosomal fusion mechanism that gave rise to the Indian muntjac karyotype from a common ancestor with 2n=70 chromosomes.
University of Missouri - Columbia
DeepVariant-TrioTrain: a customized multi-species approach to improve variant calling in animal genomes
Variant calling in many species remains challenging as most bioinformatics tools have assumptions based on human genomes. Innovative approaches like DeepVariant (DV) consistently outrank rivals in benchmarking competitions. By offering substantially fewer implementation barriers and improved accuracy without joint genotyping, DeepVariant’s appeal grows as more species adopt genomic technologies for the first time. Despite the unknown impacts of giving DV non-human genomes, a lack of viable alternatives has led many to treat DV as a species-agnostic variant caller. Here, we use bovine genomes to evaluate DeepVariant’s behavior in animal species. We create the first multi-species DV model built with cattle, bison, and yak genomes representing 15 trios. Our novel approach, TrioTrain, allows extending DV without Genome in a Bottle (GIAB) reference materials, including a region-shuffling method for SLURM-based clusters. To offset imperfect truth labels for animal genomes, we remove Mendelian discordant variants before training, where the model is tuned to genotype the offspring correctly. After 30 iterations spanning five phases, we observe remarkable performance when benchmarking with the GIAB human genomes (mean SNP F1-Score > 0.990). After training with three F1-hybrid bovine trios, our single-sample model achieves fewer Mendelian Inheritance Errors in HG002, even compared to DeepTrio. Although constrained by imperfect labels, we find that multi-species, trio-based training produces a robust variant calling model. Given the value of comparative training data, we recommend developing trio-based reference materials for more species. Our research demonstrates that exclusively training with human genomes constrains the application of deep-learning approaches for comparative genomics.
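The idea of removing Mendelian discordant variants from a trio can be sketched as follows. This is a simplified illustration assuming diploid autosomal sites with no missing genotypes; it is not TrioTrain's actual implementation, and the function names and the dictionary layout of each site are hypothetical.

```python
def is_mendelian_consistent(child, mother, father):
    """Check whether a diploid child genotype can be inherited from the parents.

    Each genotype is a tuple of two allele indices, e.g. (0, 1) for a
    heterozygote. The child must carry one allele found in the mother
    and the other allele found in the father.
    """
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


def drop_discordant(sites):
    """Keep only the sites whose trio genotypes obey Mendelian inheritance."""
    return [
        site
        for site in sites
        if is_mendelian_consistent(site["child"], site["mother"], site["father"])
    ]


# Hypothetical trio genotypes at three sites.
sites = [
    {"pos": 101, "child": (0, 1), "mother": (0, 0), "father": (1, 1)},  # consistent
    {"pos": 202, "child": (1, 1), "mother": (0, 0), "father": (0, 1)},  # discordant
    {"pos": 303, "child": (0, 0), "mother": (0, 1), "father": (0, 1)},  # consistent
]
print([site["pos"] for site in drop_discordant(sites)])  # [101, 303]
```

A real pipeline would also need to handle missing calls, sex chromosomes, and genuine de novo mutations, all of which this sketch ignores.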
USDA-ARS Pollinating Insects Research Unit
Beenome100: Evolutionary dynamics of bee genomes
Bees represent an evolutionarily and ecologically diverse species-rich group. Despite there being around 4000 species within North America, very few have chromosome-level reference genomes that are publicly available. The mission of Beenome100 is to sequence, assemble, and annotate high-quality reference genomes for 100+ U.S. bee pollinators, with a focus on native bees of conservation, management, and agricultural importance. Here, we present analyses of our Phase I data set, covering up to 6 families and 32 tribes of primarily U.S. native bees. We use comparative genomics approaches to explore evolutionary dynamics of genome structure, including genome size, chromosome number, repetitive element content, gene content, and gene family expansions and/or contractions. Our results elucidate broad patterns of genome evolution across bees, and provide an immense resource for further exploration by the research community and beyond.
US National Institutes of Health, NCBI
NCBI Orthologs: Scalability and Precision supporting Biodiversity Genomics
The National Center for Biotechnology Information (NCBI) provides orthologs for over 950 vertebrate and arthropod genomes annotated by the RefSeq project using the Eukaryotic Genome Annotation Pipeline (EGAP). NCBI Orthologs, calculated based on reciprocally best protein sequences and micro-synteny in pairwise comparison to a reference, are highly precise in Quest for Orthologs evaluation, as well as scalable to support the rapid growth of genome resources due to large-scale community efforts, including the Earth BioGenome Project. While initially used by EGAP to accurately project gene names and other information from a model species to non-model genomes, we are expanding NCBI Orthologs to include new 'anchor' reference species with community interests. For example, using honeybee as the anchor reference for Hymenoptera species increases orthologs by 50-70% compared to using Drosophila melanogaster, allowing broader data exploration and nomenclature transfer. Orthologs conserved across the Hymenoptera species are available as ortholog sets through NCBI Datasets. While automatically calculated by EGAP, the flexible organization of NCBI Orthologs enables manual curation based on community input. As the number of RefSeq genomes grows, we anticipate adding NCBI Ortholog anchor references for each insect order based on NCBI Taxonomy, with further expansion as community needs arise. We are also exploring reporting paralogs based on PANTHER reference tree graft-points calculated using InterProScan for all RefSeq eukaryotic genomes. We welcome community feedback on NCBI Orthologs and its application. This work was supported by the NCBI of the National Library of Medicine (NLM), National Institutes of Health.
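The reciprocal-best-hit criterion mentioned above can be illustrated with a short sketch. It covers only the reciprocal-best step and leaves out the micro-synteny evidence that NCBI Orthologs also uses; it is not the EGAP implementation. The function names, gene identifiers, and similarity scores are all hypothetical.

```python
def best_hits(scores):
    """Map each query to its highest-scoring target.

    scores: dict mapping (query, target) -> similarity score.
    On ties, the first pair encountered is kept.
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical protein similarity scores between species A and species B.
a_vs_b = {("a1", "b1"): 90, ("a1", "b2"): 60, ("a2", "b2"): 80, ("a3", "b2"): 85}
b_vs_a = {("b1", "a1"): 88, ("b2", "a3"): 85, ("b2", "a2"): 80}
print(reciprocal_best_hits(a_vs_b, b_vs_a))  # [('a1', 'b1'), ('a3', 'b2')]
```

Here a2 is excluded because its best hit, b2, prefers a3 in the reverse direction; requiring agreement in both directions is what keeps the pairing precise.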
Cómo Naturaleza Canta
Cómo Naturaleza Canta (How Does Nature Sing)
'Cómo Naturaleza Canta' (How Does Nature Sing) is an evocative 30-minute sound and video work exploring the interconnectedness of Colombian ecosystem sounds, music, and the voice of indigenous women and men from communities in the Colombian territory, whose wisdom is crucial in addressing our ongoing climate crisis. At its heart, the piece integrates the voices and knowledge of Mamo Senchina, a Kogui elder; his partner Saga Eulaka and their son Uribe Zarabata, an indigenous family from the Wiwa people from the Sierra Nevada de Santa Marta; Flor María Ogarí, an Embera Eyabida leader from the Resguardo Indígena de Dojura in Chigorodó, Antioquia, and Kamac Pacha (Pablo Tisoy Tandioy), researcher and steward of the ancestral knowledge and practices of the Inga people in Santiago, Putumayo. Together, their voices and experiences deepen our understanding and connection to these critical issues. This collaborative effort is enriched through four distinct layers: a sound art piece by Diana Restrepo, an experimental music composition by Hector Buitrago (Aterciopelados), a song by musician Daniel Roa (Un Bosque Encantado), video, drawing, and generative art by Rafael Puyana, and the heartfelt community work of Lany Arevalo. This performance is a testament to the collective’s interdisciplinary approach and commitment to using sound as a vehicle for sparking change in the human heart towards the healing of nature. VozTerra Collective, founded in 2019, is dedicated to connecting people with the natural world through the transformative power of sound. By recording and amplifying the voices of diverse ecosystems and fostering collaborations between artists and the environment, VozTerra aims to raise awareness about the fragility of our planet. Their innovative projects blend music, sound art, and environmental advocacy, creating impactful experiences that inspire both appreciation and action for the preservation of our planet. Cómo Naturaleza Canta (How Does Nature Sing) was commissioned for the Biodiversity Genomics Conference 2024 by Tree of Life at Wellcome Sanger Institute.