We have a new paper regarding variation of retrotransposed genes among healthy individuals:
In primates and other animals, reverse transcription of mRNA followed by genomic integration creates retroduplications. Expressed retroduplications are either ‘retrogenes’ coding for functioning proteins or expressed ‘processed pseudogenes’, which can function as noncoding RNAs. To date, little is known about the variation in retroduplications in terms of their presence or absence across individuals in the human population. We developed new methodologies allowing us to identify ‘novel’ retroduplications (i.e., those not present in the reference genome), to find their insertion points, and to genotype them. Using these methods, we catalogued and analyzed 174 retroduplication variants in almost one thousand humans, who were sequenced as part of Phase 1 of the 1000 Genomes Project. The accuracy of our dataset was corroborated by (i) multiple lines of sequencing evidence for retroduplication (e.g., depth of coverage in exons vs. introns), (ii) experimental validation, and (iii) the fact that we can reconstruct a correct phylogenetic tree of human sub-populations based solely on retroduplications. We also show that parent genes of retroduplication variants tend to be expressed at the M-to-G1 transition in the cell cycle, and that M-to-G1 expressed genes have more copies of fixed retroduplications than genes expressed at other times. These findings suggest that cell division is coupled to retrotransposition and, perhaps, is even a requirement for it.
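To give a feel for the exon-vs-intron depth-of-coverage evidence mentioned above, here is a minimal Python sketch. It is an illustration only, not the pipeline used in the paper. The intuition is that a retroduplicated copy lacks introns, so reads from the extra copy pile up on the parent gene's exons but not on its introns. The function names, the toy depth values, and the `min_ratio` threshold of 1.3 are all hypothetical choices made for this example.

```python
from statistics import mean


def exon_intron_depth_ratio(exon_depths, intron_depths):
    """Return mean exon depth divided by mean intron depth for one gene."""
    exon_mean = mean(exon_depths)
    intron_mean = mean(intron_depths)
    if intron_mean == 0:
        raise ValueError("intron depth is zero; ratio is undefined")
    return exon_mean / intron_mean


def looks_retroduplicated(exon_depths, intron_depths, min_ratio=1.3):
    """Flag a gene whose exons are covered noticeably deeper than its introns.

    min_ratio is a hypothetical cutoff chosen for illustration only.
    """
    return exon_intron_depth_ratio(exon_depths, intron_depths) >= min_ratio


# Toy data: two parental copies plus one extra intronless copy would give
# roughly 3 copies of each exon against 2 copies of each intron.
ratio = exon_intron_depth_ratio([45, 45, 45], [30, 30, 30])
candidate = looks_retroduplicated([45, 45, 45], [30, 30, 30])
control = looks_retroduplicated([30, 30, 30], [30, 30, 30])
```

In this toy case the ratio is 1.5, which is what one heterozygous intronless copy would add on top of a diploid parent gene. A real analysis would also have to account for mappability, GC bias, and sequencing depth, and would combine this signal with other evidence, such as reads spanning exon-exon junctions.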