The Transposable Elements And The Genome Biology Essay

Published: November 2, 2015 Words: 3719

Transposons are mobile DNA elements, or 'jumping genes', that can move within a genome or between genomes by horizontal transmission. They are found in both prokaryotic and eukaryotic genomes. Barbara McClintock first described them as 'controlling elements', a term that was later replaced by transposable elements (TEs). Her pioneering work on the mosaic color patterns of maize kernels and their unstable inheritance led to the discovery of the Activator (Ac) and Dissociator (Ds) elements in maize. This paved the way to understanding the dynamic nature of the genome and the control of gene expression (McClintock, 1950). More than 45% of the human genome is derived from transposons (Lander et al., 2001), which makes them an attractive subject of study and offers an opportunity to understand the forces that shaped our genome during the course of evolution. The high transposition frequency of TEs, approximately 10⁻³ to 10⁻⁵ per element per generation, provides enough raw material for evolution to have a major effect on the formation of new species through successive rounds of mobilization or loss (Biémont and Vieira, 2006). Apart from their role in evolution, TEs are emerging as promising vectors that are used extensively in transgenesis and gene therapy applications.

TE structure and transcription

Eukaryotic TEs are broadly divided into two main classes based on their mechanism of transposition. Class 1 elements are retrotransposons, which use an RNA-mediated copy-and-paste mode of transposition, and Class 2 elements are DNA transposons, which use a DNA-based cut-and-paste mode of transposition (Figure 1A). Retrotransposons account for approximately 42% of the human genome (Lander et al., 2001). Their RNA is reverse transcribed into DNA by an element-encoded reverse transcriptase, and the DNA copy is integrated into a new genomic site. Retrotransposons are further subdivided into long terminal repeat (LTR) and non-LTR elements. Human endogenous retroviruses (HERVs), which resemble retroviruses in structure and mechanism, are classified as LTR elements; a non-functional envelope gene prevents their extracellular transmission. The majority of Class 1 elements in the human genome are non-LTR long interspersed elements (LINEs or L1s) (Lander et al., 2001). The L1 retrotransposition machinery occasionally acts in trans to mobilize non-autonomous elements such as Alu, a short interspersed nuclear element (SINE), and SVA, and to generate processed pseudogenes (Boeke and Corces, 1989; Ostertag and Kazazian, 2001; Wei et al., 2001). Even though the vast majority of retrotransposons are inactive, an average human genome contains 80-100 active L1 elements (Brouha et al., 2003).

Figure 1. Types of transposable elements and mechanism of transposon mobilization. Transposable elements fall into two major groups. (A) DNA transposons encode a single gene, the transposase, flanked by two terminal inverted repeats (IRs). The transposase enzyme recognizes the IRs, excises the element and inserts it into a new genomic locus. Integration and excision sites are sealed with the help of host DNA repair enzymes. (B) and (C) Retroelements (long terminal repeat (LTR) and non-LTR type) move via an RNA intermediate and encode a reverse transcriptase (RT), integrase (IN) and endonuclease (EN). Each of these groups contains autonomous and non-autonomous elements. Non-autonomous elements do not encode the functional components required for transposition; instead, they depend on the trans action of their autonomous counterparts for mobilization. Modified after (Levin and Moran, 2011).

In nature, transposons exist either as a complete, autonomous copy that encodes all components needed for transposition (the transposase and its recognition sequences) or as a non-autonomous copy that retains only the recognition sequences and is mobilized in trans by the transposase of an autonomous copy. Transcription of a transposon is the first and most fundamental step towards transposition. Transposons rely either on their intrinsic promoter or on an adjacent genomic promoter. Retroviruses and LTR elements use their LTR as a strong promoter/enhancer signal for transcription, which is required for successful propagation to the next generation of the host organism (Boeke and Corces, 1989). Non-LTR elements such as L1 have been shown to have intrinsic promoter activity in their 5' untranslated region (UTR) (Swergold, 1990).

A DNA transposon consists of a gene encoding the transposase, flanked by two terminal inverted repeats (IRs). The transposase enzyme recognizes and binds the two IRs, cuts out the DNA and reinserts it into a new genomic location. Eukaryotic DNA transposons can be subdivided into three major classes based on their mechanism of transposition.

Classical "cut-and-paste" transposons of double stranded DNA

Helitrons which transpose using the mechanism of rolling circle replication (Kapitonov and Jurka, 2001)

Mavericks, which encode their own DNA polymerase and transpose through a replicative, copy-and-paste process (Pritham et al., 2007)
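The inverted-repeat architecture of a cut-and-paste DNA transposon described above can be illustrated with a minimal Python sketch. It measures how far the left end of an element matches the reverse complement of its right end. The function names and the toy sequence are hypothetical and chosen only for illustration; real IRs are usually imperfect, so a practical tool would need to tolerate mismatches.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence (A, C, G, T)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def terminal_inverted_repeat_length(element: str) -> int:
    """Length of the perfect terminal inverted repeat of `element`.

    The left end is compared base by base with the reverse complement of
    the right end. Counting stops at the first mismatch, or at the middle
    of the element so that the two arms never overlap.
    """
    right_arm = reverse_complement(element)
    length = 0
    for left_base, right_base in zip(element[: len(element) // 2], right_arm):
        if left_base != right_base:
            break
        length += 1
    return length


# Toy element (hypothetical): 6 bp left IR + internal sequence + 6 bp right IR
toy_element = "CAGTTG" + "ATGAAACCC" + "CAACTG"
print(terminal_inverted_repeat_length(toy_element))  # prints 6
```

A sequence without inverted ends, such as "AAAAAA", gives a length of 0 under this definition.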

DNA transposons present in the human genome belong to the Tc1/mariner superfamily (e.g. mariner, MER2-Tigger, Tc2), the hAT superfamily (e.g. MER1-Charlie, Zaphod) and some piggyBac-like elements. The two major classes have been subdivided into superfamilies and then into families on the basis of transposition mechanism, sequence similarity and structural relationships (Pace and Feschotte, 2007). Most DNA transposons either carry their own internal promoter or rely on the expression of neighboring genes to transcribe their transposase. Examples of DNA transposons that are transcribed from their own promoter include P elements in flies (Kaufman et al., 1989) and Ac and Mutator elements in plants (Fridlender et al., 1998; Raina et al., 1993). Promoter activity is not always conserved among members of the same family. Within the Tc1/mariner superfamily, Pot2 elements of the rice blast fungus have their own promoter activity in both sense and antisense directions (Kimura and Yamaguchi, 1998), as does Tc3 in C. elegans (Moldt et al., 2007), whereas other members such as Tc1 depend on fortuitous read-through transcription from adjacent genomic promoters (Sijen and Plasterk, 2003).

Host mechanisms regulating TE expression

TEs are often called 'selfish' or 'parasitic' elements, as their success is inversely proportional to the fitness of the host (Slotkin and Martienssen, 2007). The activity of TEs is often considered deleterious to the host, as they are highly mutagenic and their insertion can alter the regulation and expression of flanking genes in many ways. Although most transposons are inactive due to mutations, some remain intact but silent in host genomes. To limit the harmful effects of transposons, the host genome has evolved a number of interdependent epigenetic 'defense' mechanisms, including small noncoding RNAs, DNA methylation and chromatin modifications.

piRNA and RNAi silencing of TEs

Small interfering RNAs (siRNAs) are short RNA molecules of 21-30 nucleotides that are cleaved from double-stranded RNA precursors by Dicer-family proteins and act in a process called RNA interference (RNAi). Argonaute proteins constitute the catalytic components of the RNA-induced silencing complex (RISC). Tc1, the most abundant family of transposons in C. elegans, is silenced by RNAi in the germ line (Sijen and Plasterk, 2003). Importantly, mutations in Argonaute- and Dicer-family proteins cause the remobilization of TEs in many eukaryotic species. TEs contribute extensively to the endogenous siRNAs present in mammals (Watanabe et al., 2006). L1 retrotransposons, which account for 17% of the human genome, are silenced by RNAi triggered by the sense and antisense activity of the L1 promoter (Yang and Kazazian, 2006). Piwi-interacting RNAs (piRNAs) are a novel class of small RNAs of ∼30 nt that are expressed during male germ cell development. They depend for their biogenesis and/or stability on MIWI, a spermatogenesis-specific member of the PIWI subfamily of Argonaute proteins (Grivna et al., 2006). piRNAs are arranged in genomic clusters in mice and humans, suggesting a common function in germ cell development (Aravin et al., 2007). The Drosophila piwi gene, which is highly conserved in C. elegans and humans, has been shown to be very important for germ-line stem cells (GSCs). Moreover, the gypsy, P and I elements are all silenced in the Drosophila germ line, where the Argonaute protein Piwi is actively expressed (Cox et al., 1998; Rehwinkel et al., 2006). Structural features of TEs might help to distinguish their transcripts from those of host genes, allowing specific targeting by the piRNA and RNAi pathways.

Figure 1. Small RNA (sRNA) mediated gene silencing pathways. dsRNA triggers (represented by the hairpin), which are derived from the inverted terminal repeats of a DNA transposon, are processed and cleaved into 21-24 nucleotide (nt) siRNAs by the Dicer family of proteins. These siRNAs guide the Argonaute proteins to complementary messenger RNAs (mRNAs) and mediate their degradation or translational repression. Germline-specific Piwi-interacting RNAs (piRNAs) are processed from long single-stranded RNA, often antisense transcripts of transposons. Binding of the mature piRNA by the PIWI or Aubergine (AUB) proteins allows it to be directed to complementary sequences in TE mRNA. Endonucleolytic cleavage of the mRNA, 10 nt from the 5′ end of the small RNA, and 3′ cleavage/processing liberate a secondary sense-strand transposon piRNA, which associates with the Argonaute 3 (AGO3) protein. The binding of this complex to complementary sequences in the original precursor piRNA, followed by endonucleolytic cleavage, regenerates an antisense piRNA that can be directed to TE mRNA at the transcriptional and post-transcriptional level. Modified after (Levin and Moran, 2011).
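The 10 nt cleavage rule in the piRNA cycle described in the caption above can be made concrete with a small Python sketch. It assumes perfect complementarity between the antisense piRNA and the TE transcript and writes RNA in the DNA alphabet (T instead of U) for simplicity. The function names and the toy transcript are hypothetical; the sketch only illustrates why the 5′ ends of the primary and secondary piRNAs are complementary over 10 nt.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase sequence (A, C, G, T)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def secondary_pirna_start(te_mrna: str, antisense_pirna: str, offset: int = 10):
    """Index in `te_mrna` where the secondary (sense) piRNA would begin.

    The antisense piRNA pairs with its perfectly complementary site in the
    transcript. The transcript is cut opposite the position `offset` nt from
    the 5' end of the piRNA, and the 3' cleavage product starts there.
    Returns None if no perfectly complementary site exists.
    """
    site = te_mrna.find(reverse_complement(antisense_pirna))
    if site == -1:
        return None
    return site + len(antisense_pirna) - offset


# Toy transcript (hypothetical) and an antisense piRNA derived from it
te_mrna = "GGAACCTTGACTGACCATGCAAGTCCGATTACGGATCCAAGG"
primary = reverse_complement(te_mrna[5:31])  # 26 nt antisense piRNA
start = secondary_pirna_start(te_mrna, primary)

# The first 10 nt of the secondary piRNA pair with the first 10 nt of the primary
secondary_5p = te_mrna[start:start + 10]
print(start, secondary_5p == reverse_complement(primary[:10]))
```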

DNA methylation and chromatin modifications

DNA methylation is an important epigenetic mark that plays a pivotal role in regulating gene expression during development. DNA methylation patterns set in germ cells during gametogenesis are largely erased during embryogenesis and reset after implantation (Gaudet et al., 2004). Cytosine residues in DNA can be methylated in both plants and animals, a reaction generally carried out by a family of DNA methyltransferase enzymes. In the mouse, methylation of the intracisternal A-type particle (IAP) retrotransposon suppresses its expression during embryogenesis, whereas inactivation of DNMT1, the DNA methyltransferase responsible for the maintenance of DNA methylation, leads to elevated levels of IAP transcripts (Walsh et al., 1998).

Figure 1. Developmental triggers of transposition. Germline transposable element (TE) integration events can result from TE mobility in cells that give rise to gametes or from TE mobility post-fertilization during early development. Embryonic TE mobility in cells that do not contribute to the germ line or mobility at later developmental stages can, in principle, lead to somatic TE integration events. Modified after (Levin and Moran, 2011)

DNA methylation plays an important role in chromatin condensation and packaging, together with modifications of the histone amino (N)-terminal tails, which alter the affinity of chromatin for transcription factors. Nucleosomes in TE-dominated regions are enriched for methylation of histone H3 at lysine 9 (H3K9), a mark associated with transcriptional repression (Martens et al., 2005). In Arabidopsis, RNA-directed DNA methylation can be induced by double-stranded RNA through RNAi. DICER-LIKE 3 (DCL3) recruits components that signal DNA methylation in a manner independent of its catalytic activity and generates larger siRNAs of 24-26 nucleotides that form a complex with Argonaute4 (AGO4), which silences TEs by asymmetrical DNA methylation (Qi et al., 2006). Transcriptional gene silencing (TGS) mediated by RNAi is poorly understood in mammals, but it is known that artificially introduced siRNAs can bind the DNA methyltransferase Dnmt3a to direct DNA methylation in human cells (Weinberg et al., 2006).

Co-adaptation of TEs for host functions

During the course of evolution, the abundance of transposons and their ability to induce mutations led to their domestication (Miller et al., 1999). They are known to participate in regulatory networks, for example by providing enhancer signals that can rewire host gene expression. It has been shown that almost 25% of human promoters contain sequences derived from TEs (Jordan et al., 2003). A eutherian-specific transposable element (MER20) has contributed to the origin of a placenta-specific gene regulatory network in mammals by augmenting the cAMP signaling pathway in endometrial stromal cells (Lynch et al., 2011). Maintenance of intact Drosophila telomeres is associated with terminal transpositions of the specialized retrotransposons TART (telomere-associated retrotransposon) and HeT-A (Levis et al., 1993). LTR class I endogenous retrovirus (ERV) retroelements have evolved to carry 1,500 binding sites for p53, accounting for 30% of all p53 binding sites. These ERV elements are primate-specific and show how endogenous retroviruses have shaped the transcriptional network of the human tumor suppressor protein p53 (Wang et al., 2007). V(D)J recombination is a specialized DNA rearrangement of variable (V), diversity (D) and joining (J) gene segments that played an important role in the development of the vertebrate immune system. The RAG1 and RAG2 proteins are the essential components that interact to form the recombinase responsible for the joining and transfer activities. Recombination signal sequences (RSS) adjoining the V, D and J segments direct sequence-specific cleavage and joining by the RAG1/2 protein complex (Gellert, 2002). The entire mechanism is strongly reminiscent of a transposition reaction, and RAG1/2 can catalyze transposition of a DNA segment flanked by RSS in vitro (Agrawal et al., 1998). The catalytic core of RAG1 has evolved from Transib elements, a group of DNA transposons identified in invertebrates, and the structure of the RSS shows sequence similarity to the terminal inverted repeats (TIRs) of Transib transposons (Kapitonov and Jurka, 2005). Taken together, these findings indicate that V(D)J recombination represents a successful co-adaptation and domestication of transposons by the host machinery.

Tc1/mariner family and the Sleeping Beauty (SB) transposon system

The Tc1/mariner family is the most widespread DNA transposon family in eukaryotes, with members found in organisms ranging from plants to humans (Plasterk, 1999). Its members vary in length from 1.3 to 2.4 kb and encode a single transposase enzyme (Ivics et al., 1997; Robertson, 1993). The founding member was identified as a repetitive element in C. elegans (van Luenen et al., 1993) and, once its transposition was detected, was named Tc1 (for transposon Caenorhabditis number 1). The other members of the mariner family are mostly found in different fly species (Bigot et al., 1994; Capy et al., 1994) but have since been reported in humans. The transposases of this family are characterized by a DDE or DDD motif, which is present in most transposases and integrases; the elements carry terminal inverted repeats and integrate preferentially at TA sequences (Vos and Plasterk, 1994). Unfortunately, most members of the Tc1/mariner family are inactive as a result of "vertical inactivation" (Lohe et al., 1995). Nevertheless, an endogenous Tc1-like element (TcE) of Drosophila hydei was successfully used for germline transgenesis of the fly Ceratitis capitata (Loukeris et al., 1995). Molecular reconstruction from multiple inactive Tc1/mariner elements in fish resulted in the resurrection of an active transposon named Sleeping Beauty (SB). SB is the most active and best-studied transposon in vertebrates, and it is valuable for two main reasons: it offers an opportunity to understand host-transposon regulation, and it is a non-viral vector for vertebrate transgenesis and gene therapy. The SB transposon system consists of two main components: the transposon, which is flanked on both sides by terminal inverted repeats (IRs) that each contain two transposase binding sites (DRs), and the transposase. The IRs are 230 bp long and each contains two 32 bp imperfect direct repeats (DRs). The two IRs are not identical: the left IR contains a UTR region that can function as an enhancer for transcription of the SB transposase in its native arrangement. The SB transposase consists of 340 amino acids. Its N-terminal paired-like DNA-binding domain recognizes the IRs and overlaps with a nuclear localization signal (NLS) involved in nuclear transport, while the C-terminal domain with the characteristic DDE signature is responsible for the catalytic activities of DNA cleavage, strand transfer and joining (Ivics et al., 1997).

Figure 1. The structure of the Sleeping Beauty transposase. Sleeping Beauty consists of a paired-like DNA-binding domain, a nuclear localization signal (NLS) and the C-terminal catalytic domain.

A typical SB transposition event starts with the binding of the transposase to the IRs, followed by the formation of a synaptic complex in which the two transposon ends are brought into close proximity. The transposon is then excised from the donor locus and reintegrated into a new locus, leaving a 5 bp footprint mutation at the donor site (Ivics et al., 1997).

Figure 1. Schematic representation of the cut-and-paste transposition. The transposase gene (blue box) is flanked by the inverted repeats (IR; grey arrows). The transposase (green circle), the only protein needed for the transposition reaction, binds to the inverted repeats, catalyzes the excision of the transposable element from the donor locus (green lines) and mediates the integration of the element into a new DNA locus (yellow lines). DNA breaks at the integration and donor site are repaired by the host DNA repair machinery.

The detailed steps are illustrated in the schematic diagram. The SB transposase can recognize IRs in either a cis or a trans arrangement, which makes it possible to physically separate the transposase gene from the IRs. The trans arrangement makes it easy to clone any gene of interest between the IRs (Figure ); with the transposase expressed from a strong promoter, the system serves as a valuable tool for genome engineering. However, the efficiency of transposition in the two-component system can be limited by overproduction inhibition, a phenomenon that is widespread in the Tc1/mariner family, and by the cargo capacity, that is, the size of the DNA insert cloned between the IRs (Grabundzija et al., 2010).

Figure 1. The Sleeping Beauty transposon system. (A) Natural arrangement of the Sleeping Beauty transposon. The transposase gene (blue box) is flanked by the inverted repeats (IR; grey arrows) that contain the transposase binding sites (DR; white arrows). (B) Laboratory arrangement of the Sleeping Beauty gene transfer vector system. The transposase coding region is replaced by a gene of interest (green box). The transposase is provided on a separate plasmid vector expressed from a suitable promoter (blue arrow).

A systematic comparison of Tc1/mariner transposon vector systems (Tc1, Tc3, Himar1 and Mos1) in an in vitro mammalian cell culture assay determined that SB is the most efficient system (Fischer et al., 2001). SB has a target site preference for the palindromic AT-repeat ATATATAT, in which the central TA is the canonical target site. Upon integration, the TA dinucleotide at the target site is duplicated, and the resulting gaps are repaired by the host repair machinery. SB is the most active transposon system of the Tc1/mariner family and is used in genome manipulation and various gene therapy applications in vertebrates (reviewed in Mátés et al., 2007). Molecular reconstruction by in vitro evolution led to the generation of novel hyperactive forms of SB, of which SB100X is the most active; it supported 35-50% stable gene transfer in human primary cells and 45% stable transgenesis in mouse zygotes (Mátés et al., 2009).
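The target site preference and TA duplication described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: sequences are uppercase strings, integration is modelled as a simple string insertion, and the function names, the toy genome and the toy transposon are all hypothetical. It does not model transposase binding or DNA repair.

```python
def ta_target_sites(genome: str) -> list:
    """Indices of every TA dinucleotide, the potential integration sites."""
    return [i for i in range(len(genome) - 1) if genome[i:i + 2] == "TA"]


def preferred_ta_sites(genome: str, motif: str = "ATATATAT") -> list:
    """Indices of the central TA of each occurrence of the preferred motif."""
    central_offset = len(motif) // 2 - 1  # offset 3 for the 8 bp motif
    sites = []
    start = genome.find(motif)
    while start != -1:
        sites.append(start + central_offset)
        start = genome.find(motif, start + 1)  # allow overlapping motifs
    return sites


def integrate(genome: str, ta_index: int, transposon: str) -> str:
    """Insert `transposon` at a TA site so that TA flanks it on both sides."""
    if genome[ta_index:ta_index + 2] != "TA":
        raise ValueError("integration site must be a TA dinucleotide")
    # Keeping the TA on the left and repeating it on the right models
    # the target site duplication.
    return genome[:ta_index + 2] + transposon + genome[ta_index:]


genome = "GGCATATATATCCG"       # toy genome containing one ATATATAT motif
transposon = "CAGTTGCAACTG"     # toy transposon with 6 bp inverted ends
site = preferred_ta_sites(genome)[0]
integrated = integrate(genome, site, transposon)
print(ta_target_sites(genome), site)
print(integrated)
```

In this model the integrated sequence is longer than the original by the transposon length plus 2 bp, which reflects the duplicated TA.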

Host factors and Sleeping Beauty regulation

The SB transposon is active in a wide range of vertebrates, but its efficiency differs between species and between cells of different tissues of the same species. A possible explanation for these differences is the interaction of the transposition machinery with host factors. If host proteins do indeed participate in the transposition reaction, they must be conserved among vertebrates. A highly conserved DNA-bending protein belonging to the high-mobility group of proteins, HMGB1, was the first to be identified as a cofactor of SB transposition; it is necessary for the formation of transposase-transposon complexes at the internal DRs (Zayed, 2003). Transposition of SB produces DNA double-strand breaks, which have been shown to be repaired by the host repair machinery. Ku70, an important protein of the non-homologous end joining repair pathway, interacts physically with the SB transposase, establishing a functional link between the host DNA repair machinery and the transposase (Izsvák et al., 2004).

Epigenetic modification of the SB transposable element by CpG methylation within the transposon sequence enhances the transposition frequency of SB (Yusa et al., 2004). Through its interaction with another host-encoded protein, the transcription factor Miz-1, the SB transposase down-regulates cyclin D1 expression in human cells and induces a G1 slowdown, which can be seen as a selfish act that maximizes transposition (Walisko et al., 2006).

HMG2L1 induces transcription of the transposon 5′-UTR

The high-mobility-group-box (HMGB) proteins are one of the three superfamilies of HMG chromosomal proteins and can be further classified into two major subgroups. Group 1 proteins have more than one HMGB domain and a long acidic C-terminal tail, and bind DNA without particular sequence specificity. Group 2 proteins contain only a single HMGB domain, show some degree of sequence specificity and can act as transcription factors (Bustin, 1999). HMG2L1 (high-mobility group protein 2-like 1) belongs to the second subgroup of HMGB proteins. It has been shown that HMG2L1 negatively regulates Wnt signaling by interacting with a novel NLK-binding protein (Yamada et al., 2003). Further studies have shown its role in attenuating smooth muscle differentiation (Zhou et al., 2010). A yeast two-hybrid screen for additional host proteins that interact with the SB transposase identified HMG2L1. HMG2L1 interacts with the SB transposase and binds the 5' untranslated region of the transposon, thereby driving transposase expression; in the presence of SB transposase, however, HMG2L1 is negatively regulated by feedback inhibition (Walisko et al., 2008).

Aims and Objectives

Transposable elements are often regarded as selfish DNA parasites that are rarely co-opted by the genome to serve a beneficial role. A high level of transposition can negatively affect the fitness of the host, which suggests that the activity of transposable elements is tightly controlled within the cellular environment. The Sleeping Beauty (SB) transposon is a member of the Tc1/mariner superfamily of DNA transposons, which are mobilized via a cut-and-paste mechanism. In this study, the Sleeping Beauty (SB) transposable element was used as a tool to investigate transposon-host cell interactions in vertebrates.

Transposition of Sleeping Beauty (SB) is highly regulated by host cellular factors. Expression of the transposase is driven by its own promoter, which is located in the 5' UTR region of the transposon. Notably, transcription is significantly upregulated by a cellular protein, HMG2L1. As HMG2L1 is a poorly characterized protein, studying the protein interaction network that regulates its activity was of great interest, as this in turn modulates transposase transcription and expression.

The Sleeping Beauty (SB) transposon transposes efficiently in vertebrate cells, but its efficiency varies between species and between cells of different tissues of the same species. A possible explanation is that the expression and regulation of host factors determine the efficiency of transposition. Using zebrafish as a model organism, this study aimed to decipher the developmental expression of HMG2L1 and its role in the dynamic regulation of the transposase during early embryonic and germ cell development.

Sleeping Beauty (SB) based integration systems have been widely used for the genetic manipulation of vertebrates. The aim was to harness the power of transposon-based transgenesis coupled with recombination-mediated cassette exchange (RMCE) to create a model transgenic rat in which a transgene of interest can be retargeted into transposon-tagged genomic loci, circumventing the problems of transgenesis based on pronuclear injection.