A fully resolved, chromosome-level genome assembly of a tapeworm provides the first complete picture of a representative of one of the three main branches of animal life, Lophotrochozoa, offering a resource free from sampling error and revealing unexpected information about the evolution of chromosomes.
Tapeworms are ubiquitous parasites of all classes of vertebrates and have complex life cycles that typically involve at least one invertebrate intermediate host and one vertebrate final host, in whose intestine the adult segmented worm resides. Human infection is most serious when we play the role of an intermediate host, acquiring larval forms of the tapeworm that localize outside the enteric system, often in association with the central nervous system. For example, human infection with larvae of the “pork” tapeworm Taenia solium is estimated to be responsible for a third of epilepsy cases in Latin America.
While these species are important to study, their life cycles cannot be practically maintained in the laboratory and, therefore, much of our fundamental understanding of tapeworm biology is instead based on species whose life cycle involves beetles and rodents, as these hosts are themselves used as laboratory models.
The characterization of the genomes of parasitic worms, led by the Parasite Genomics Group at the Sanger Institute, represents one of the most important global advances in our efforts to overcome the chronic diseases caused by these pathogens. In about a decade, the genomes of the most important parasitic species of flatworms (i.e. platyhelminths) and roundworms (i.e. nematodes) have been characterized and the data made available to all.
Some of these genomes have now been assembled at the level of complete chromosomes, making it possible to investigate not only the contents of the genome but also the genetic landscape arranged along its chromosomes. One of these is the mouse bile duct tapeworm, Hymenolepis microstoma, an important laboratory model for which a genome project was published in 2013.
Bring it together
The first genome-level sequencing technologies were based on a divide-and-conquer approach: the genome is fragmented into millions of short pieces (hundreds of bases long) which are then sequenced in parallel, generating millions of short ‘reads’ that must be assembled on a computer. Although these technologies are sufficient to cover all or most of the bases, short reads are problematic to assemble, because repetitive and low-complexity sequences in the genome mean that many reads cannot be unambiguously aligned to unique positions.
As a result, most genomes characterized to date still consist of unassembled sequence fragments that far outnumber the chromosomes of the organism, obscuring their syntenic relationships (i.e. the relative positions of the various genetic elements).
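The ambiguity that repeats introduce can be illustrated with a toy example. The sketch below is not how real assemblers or aligners work; it simply searches for exact matches of a read in an invented sequence. The names `find_all`, `genome`, `unique_read` and `repeat_read`, and the sequences themselves, are all hypothetical and chosen only for illustration.

```python
def find_all(genome: str, read: str) -> list[int]:
    """Return every 0-based position at which `read` matches `genome` exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        # Resume one base further on so overlapping matches are also found.
        start = genome.find(read, start + 1)
    return positions


# Invented toy "genome": a unique flank, three tandem copies of a repeat,
# and a second unique flank.
repeat = "ACGTTG"
genome = "GGATCCA" + repeat * 3 + "TTCAGGC"

unique_read = "ATCCAACG"  # overlaps the left flank, so it has one possible home
repeat_read = "CGTTGACG"  # lies entirely within the repeat array

print(find_all(genome, unique_read))  # one candidate position
print(find_all(genome, repeat_read))  # more than one candidate position
```

A read that overlaps unique sequence has a single possible placement, whereas a read drawn entirely from the repeat array fits in more than one place. A read long enough to span the whole array, together with both flanks, would remove the ambiguity.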
Newer technologies allow the sequencing of long reads, and even very long reads, generating contiguous sequences of hundreds of thousands to millions of bases, while complementary approaches such as optical mapping (a non-sequence-based approach to physically mapping the relative positions of chromosome fragments) provide additional evidence to facilitate higher-level assembly. These technologies were used to transform the genome project of H. microstoma into a fully assembled, chromosome-level reference genome, the first fully resolved genome of a representative of Lophotrochozoa: the large animal group encompassing mollusks, annelids, flatworms, and a wide range of smaller phyla of invertebrate animals.
Most important junk
A fully characterized and assembled genome is invaluable in research for many reasons, not least because it is free from sampling error (for example, is a gene really missing, or has the genome simply not been fully characterized?). Beyond the content of the genome, it also offers the opportunity to study its architecture: how the different elements of the genome, from the parts that code for proteins to those that represent genomic invaders, are arranged along the different chromosomes (the longest contiguous stretches of a eukaryotic genome).
It has been known since the early days of sequencing the human genome that a large part of it consists of short, non-coding sequence motifs. Originally known as “junk DNA”, their importance to the evolution of the genome is only beginning to be appreciated. These sequences are the result of “transposable elements” (TEs), which are pieces of foreign DNA, often of viral origin, that are incorporated into the genome and are variously eliminated or amplified in copy number during evolution. All eukaryotic genomes studied to date contain TEs, which may represent more than half of the genome in some species.
Today, it is accepted that, far from being junk, TEs are in fact responsible for some of the most important aspects of genome evolution, such as gene duplication and rearrangement. But they are also responsible for the evolution of the linear chromosomes themselves (a characteristic of eukaryotes), which are “capped” (terminated) by short sequence motifs called telomeres. These six-base repeats work to maintain both the linearity and the full length of chromosomes during replication.
Meanwhile, a much longer sequence motif (~370 bases) called the centromere acts as the spindle attachment site, allowing homologous chromosomes to separate during cell division. Centromeres also have their evolutionary origins in TEs and, paradoxically, have species-specific sequence identities shaped by the dynamic evolution of TEs, even though they play a fully conserved and fundamental role in mitosis.
Losing your telomeres
Complete assembly of the H. microstoma genome revealed that its chromosomes are capped by telomeres at one end, while the opposite ends instead terminate in what turned out to be arrays of centromeric sequences. Conventionally, the position of the centromere relative to the ends of the chromosomes has been used to describe the “karyotype” of a species. Chromosomes whose centromeres lie near their ends are known as “telocentric” (“near the telomeres”) and, given the limited resolving power of karyological techniques, it has been assumed that they nevertheless end in telomeric sequence.
However, the H. microstoma genome shows definitively that chromosomes can indeed terminate in centromeric arrays, which have probably come to replace telomeres during evolution.
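To make the idea of a chromosome "ending in" one kind of repeat or another concrete, here is a minimal sketch that counts uninterrupted tandem copies of a motif at each end of a sequence. It is an illustration only, not the method used in the study. The telomere motif TTAGGG (familiar from vertebrates) is used purely as a stand-in, since the text above says only that telomeres are six-base repeats; the eight-base "centromere-like" unit, the toy chromosome and all function names are invented, and real repeat detection must tolerate sequence variation.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string made of A, C, G and T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def terminal_copies(seq: str, motif: str, end: str) -> int:
    """Count uninterrupted tandem copies of `motif` at the 'left' or 'right' end."""
    k = len(motif)
    copies = 0
    if end == "left":
        while seq.startswith(motif, copies * k):
            copies += 1
    else:
        # Test ever-shorter prefixes of seq for a trailing copy of the motif.
        while seq.endswith(motif, 0, len(seq) - copies * k):
            copies += 1
    return copies


def describe_ends(seq: str, telomere: str = "TTAGGG", min_copies: int = 3) -> dict:
    """Label each end 'telomere' if it carries enough telomeric repeats, else 'other'."""
    # Depending on strand, the repeat may appear as its reverse complement.
    motifs = (telomere, reverse_complement(telomere))
    labels = {}
    for end in ("left", "right"):
        best = max(terminal_copies(seq, m, end) for m in motifs)
        labels[end] = "telomere" if best >= min_copies else "other"
    return labels


# Invented toy chromosome: telomeric repeats on the left, an array of a
# hypothetical centromere-like unit on the right.
centromere_unit = "ACGTTGCA"  # made up; the real motif is far longer
chromosome = "CCCTAA" * 4 + "GATTACAGATTACA" + centromere_unit * 3

print(describe_ends(chromosome))
print(terminal_copies(chromosome, centromere_unit, "right"))
```

In this toy sequence only the left end qualifies as telomeric, while the right end finishes in copies of the other repeat, mirroring in miniature the pattern described for H. microstoma.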
Being terminal requires the centromere to act as a telomere in protecting the chromosome ends. At the same time, it must also retain its ancestral role as a substrate for the attachment of the spindle during cell division. However, evolving to play a dual role is far from straightforward, as there are telomere-specific proteins that interact directly with telomeric sequences to maintain chromosome-length homeostasis, and in the case of H. microstoma these presumably must also interact with the centromeric sequence motif. This and other implications for the underlying mechanisms that orchestrate these fundamental processes in H. microstoma require further investigation. Meanwhile, whether or not terminal centromeres are found in other species described as having telocentric karyotypes awaits the complete assembly of additional genomes.
Parasites are just organisms
It is a common conceit, even among biologists, to view parasites as if they were completely separate from free-living animals, having nothing (ultimately) to teach us about our own biology. Likewise, any novelty in their biology is typically attributed to their parasitic lifestyles. But parasitism is simply a trophic strategy that has been exploited by at least a few lineages within most major groups of organisms; and few would argue, for example, that we have nothing to learn from the study of herbivores except to understand herbivores. These findings in a tapeworm point to fundamental lessons about chromosome evolution applicable to all organisms, not to the effects of “parasitism” on the genome.