Although recent studies have revealed the genome-wide distribution of R-loops, our understanding of R-loop formation is still limited. Genomes are known to have a large number of repetitive elements. Emerging evidence suggests that these sequences may play an important regulatory role. However, few studies have investigated the effect of repetitive elements on R-loop formation.
We found that different repetitive elements are related to R-loop formation in different species. After controlling for length and genomic distribution, we observed that satellites, long interspersed nuclear elements (LINEs), and DNA transposons were specifically enriched for R-loops in humans, fruit flies, and Arabidopsis thaliana, respectively. R-loops also tended to arise in regions of low-complexity or simple repeats across species. We also found that the repetitive elements associated with R-loop formation differ according to developmental stage. For instance, in fruit flies, LINEs and long terminal repeat retrotransposons (LTRs) are more likely to contain R-loops in embryos, whereas low-complexity and simple repeats are more likely to contain them in post-developmental S2 cells.
Our results indicate that repetitive elements may have species-specific or development-specific regulatory effects on R-loop formation. This work advances our understanding of repetitive elements and R-loop biology.
By first comparing the enrichment of repetitive elements in humans, fruit flies (S2 cells), and A. thaliana, we observed some patterns that were relatively consistent among species (Fig. 2, middle panel). Under the sampling control, low-complexity and simple repeats tended to be enriched in R-loops across species, while rolling-circle (RC) elements appeared less frequently in R-loops than in random cases. Consistent with a previous report, DNA transposons, LINEs, and LTRs were underrepresented in animal (human and fruit fly) R-loops; however, they were overrepresented in plant (A. thaliana) R-loops. In human and A. thaliana R-loops, short interspersed nuclear elements (SINEs) were underrepresented, whereas rRNA was mildly enriched. We also found some species-specific patterns. For example, satellites were significantly enriched in human R-loops but significantly depleted in fruit fly R-loops. Notably, when we switched the control to transcribed regions (GRO control), we found that retroposons and satellites were enriched in human and A. thaliana R-loops, respectively, suggesting a positive correlation between the formation of cis R-loops and these two repetitive elements. Additionally, in humans and A. thaliana, the high degree of overlap between R-loops and transcribed regions (Fig. 1C) increases confidence that R-loops form in cis in these two species.
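The idea behind a length-matched sampling control can be illustrated with a minimal sketch. This is not the pipeline used in this study: the single-chromosome genome, the toy coordinates, and the function names (`fraction_overlapping`, `sampled_control`, `enrichment`) are all hypothetical, and the sketch matches only interval length, not the genomic distribution that the actual control also accounts for.

```python
import random

def overlaps_any(start, end, features):
    """Return True if the half-open interval [start, end) overlaps any feature."""
    return any(s < end and start < e for s, e in features)

def fraction_overlapping(intervals, features):
    """Fraction of intervals that overlap at least one feature."""
    hits = sum(overlaps_any(s, e, features) for s, e in intervals)
    return hits / len(intervals)

def sampled_control(intervals, genome_length, rng):
    """Draw one random interval per input interval, preserving each length.

    Assumption: a single linear chromosome of size genome_length.
    """
    control = []
    for s, e in intervals:
        length = e - s
        new_start = rng.randrange(0, genome_length - length + 1)
        control.append((new_start, new_start + length))
    return control

def enrichment(rloops, repeats, genome_length, n_iter=200, seed=0):
    """Return (observed, expected) overlap fractions.

    'expected' is the mean overlap fraction over n_iter length-matched
    random samples; observed > expected suggests enrichment.
    """
    rng = random.Random(seed)
    observed = fraction_overlapping(rloops, repeats)
    expected = sum(
        fraction_overlapping(sampled_control(rloops, genome_length, rng), repeats)
        for _ in range(n_iter)
    ) / n_iter
    return observed, expected

# Toy data (hypothetical coordinates, not taken from the study).
toy_repeats = [(1000, 1100), (5000, 5100)]
toy_rloops = [(1050, 1150), (5020, 5080), (8000, 8100)]
obs, exp = enrichment(toy_rloops, toy_repeats, genome_length=10_000)
print(f"observed={obs:.3f} expected={exp:.3f}")
```

In this toy case two of the three R-loop intervals overlap a repeat, whereas random placements rarely do, so the observed fraction exceeds the expected one. Replacing the random intervals with transcribed regions would give an analogue of the GRO control.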
We asked whether the association between repetitive elements and R-loop formation varies during development. For this purpose, we compared embryos and S2 cells of the fruit fly. Surprisingly, under the sampling control, we found that in embryos R-loops were more likely to form in regions containing LINEs and LTRs (Fig. 2, middle panel). In post-developmental S2 cells, R-loops were highly concentrated in regions with low-complexity or simple repeats. However, when we applied the GRO control (Fig. 2, right panel), these differences were attenuated (e.g., for low-complexity and simple repeats), and LINEs and LTRs no longer distinguished embryos from S2 cells, showing R-loop enrichment in both. Note that, in fruit flies, only a small fraction (up to 39%) of R-loops overlap with transcribed regions, which might be a source of this inconsistency (Fig. 1C).
Further, we investigated the relationship between R-loop formation and repetitive elements in each species at the repeat family level. For humans, under both the sampling and the GRO controls, we consistently observed that R-loops were enriched for the centr, telo, low-complexity, simple repeat, snRNA, and rRNA families (Fig. 3). Interestingly, under the GRO control, we found that SVA repeat elements (belonging to the retroposon class) and satellites preferentially occurred in R-loops.
We found that repetitive elements contribute differently to R-loop formation among the samples investigated in this study. In human U2OS cells, 21.19% of the DRIP-seq signals contained repetitive elements, while 43.19% of the GRO-seq data contained these signatures (Additional file 2A). Further analysis revealed that some repetitive sequences, especially TEs, including LINEs, SINEs, LTRs, and DNA transposon families, did not tend to form R-loops in U2OS cells. On the other hand, low-complexity, satellite, simple repeat, retroposon, snRNA, and rRNA sequences were enriched in R-loop regions compared to non-repetitive sequences in the GRO-seq control. These results are consistent with previous reports [3, 24, 51, 53, 54]. Notably, a recent report has shown that low-complexity and simple repeat sequences are strongly associated with promoter regions, as are R-loop structures [3, 24, 51]. These results suggest that repetitive elements, such as low-complexity and simple repeats, are key features of R-loop formation in promoter regions. Interestingly, low-complexity sequences have also been shown to be associated with the binding of Ezh2, a component of polycomb repressive complex 2 (PRC2) that has methyltransferase activity for histone H3 lysine 27. Another report has shown that R-loop formation is required for the recruitment of PRC2 and repression of a subset of polycomb target genes. These results suggest that R-loop formation involving low-complexity elements could be important for the recruitment of PRC2 and epigenetic regulation of target genes. Therefore, we hypothesize that different repetitive elements in R-loop regions might contribute differently to the downstream function of R-loops.
In contrast to human U2OS cells, in A. thaliana seedlings 22.25% of the DRIP-seq signals contained repetitive elements, while only 3.08% of the GRO-seq data contained these elements (Additional file 2C). In addition to simple repeats, low-complexity repeats, and satellites, which are prone to form R-loops in human U2OS cells, TEs, including LTRs, DNA transposons, and LINEs, were preferentially enriched in R-loop regions in A. thaliana seedlings. These results imply that R-loop formation does not simply depend on genomic sequence features but depends strongly on the species (or biological context). Given that R-loop formation is essential for epigenetic regulation [3, 24, 51], TEs that form R-loops could be critical regulatory elements for gene regulation in A. thaliana seedlings. Further analysis of such factors should help reveal the functional significance of R-loop formation in TEs.
To investigate the contribution of repetitive elements to R-loop formation at different developmental stages, we compared the distribution of repetitive elements in R-loop regions between fly embryos and S2 cells. In fly embryos, 15.5% of the DRIP-seq signals contained repetitive elements, compared to only 5.35% of the GRO-seq data (Additional file 2B). In S2 cells, 9.81% of the DRIP-seq signals contained repetitive elements, while 6.28% of the GRO-seq data contained those elements (Additional file 2B). These results show that the contribution of repetitive elements to R-loop formation is more prominent in embryos than in S2 cells, suggesting that the impact of repetitive elements on R-loop formation changes markedly across developmental stages or cell lineages. We also observed that LTRs, LINEs, and satellites were highly enriched in embryo R-loops and less enriched in S2 R-loops. Conversely, simple repeats and low-complexity repeats were relatively enriched in S2 cells and less enriched in embryos. We speculate that repetitive elements could change their function through R-loop formation, depending on the developmental context. For example, gypsy, which is known as one of the major insulator elements in flies, is more highly enriched in embryo R-loops than in S2 R-loops. R-loop formation on gypsy may alter the function of the insulator or of the protein complex on insulator bodies, resulting in downstream regulation of the chromatin compartment. This case is consistent with the recent observation that R-loop formation is associated with an enhancer- and insulator-like state. Further investigation is required to reveal the relationship between R-loop formation and the insulator function of gypsy elements.
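The DRIP-seq and GRO-seq percentages quoted for the four samples can be put side by side with a short calculation. The DRIP-to-GRO ratio used here is only an illustrative summary and not a statistic reported in this study; the dictionary keys and the function name `drip_to_gro_ratio` are hypothetical.

```python
# Percentage of signal regions containing repetitive elements, as quoted in the text.
repeat_percent = {
    "human U2OS": {"drip": 21.19, "gro": 43.19},
    "A. thaliana seedlings": {"drip": 22.25, "gro": 3.08},
    "fly embryo": {"drip": 15.5, "gro": 5.35},
    "fly S2": {"drip": 9.81, "gro": 6.28},
}

def drip_to_gro_ratio(entry):
    """Ratio of repeat-containing DRIP-seq signal to repeat-containing GRO-seq signal.

    Assumption: this simple ratio is used only as a rough comparison across
    samples; it ignores region length and repeat class.
    """
    return entry["drip"] / entry["gro"]

ratios = {sample: drip_to_gro_ratio(v) for sample, v in repeat_percent.items()}

# Print samples from the highest to the lowest ratio.
for sample, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{sample}: {ratio:.2f}")
```

The ratios come out at roughly 7.22 for A. thaliana seedlings, 2.90 for fly embryos, 1.56 for S2 cells, and 0.49 for U2OS cells, which matches the ordering described in the text: repeats are overrepresented in R-loops relative to transcribed regions in A. thaliana and fly embryos, less so in S2 cells, and underrepresented in U2OS cells.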
Our results highlight the impact of TEs on R-loop formation, especially at different developmental stages. This suggests that TE sequences themselves could tend to form R-loops. Because TEs originate from exogenous viruses, they are targeted for gene silencing by multiple layers of defense mechanisms that prevent the harmful effects of TE activity. Therefore, R-loop formation involving TEs might be one such mechanism by which cells mitigate the effects of TEs. It has been shown that R-loop formation can stimulate transcription of an antisense sequence, resulting in the formation of heterochromatin [57, 58]. Such a mechanism would be consistent with a role for R-loop formation in silencing TEs. Similarly, it is reasonable that R-loops act as regulatory signals in epigenetic regulation if their functional origin derives from TE regulation. Moreover, chromatin loosening following the depletion of histone H1 induces the accumulation of R-loops in heterochromatic regions enriched with repetitive elements, including several types of TEs. This result suggests that TEs could preferentially form R-loop structures when their silencing by heterochromatin is relieved. This is consistent with the notion that transcription of TE sequences increases the likelihood of R-loop formation. Taken together, R-loop formation might be closely associated with TE sequences, although further experimental studies are required to confirm this hypothesis.