Researchers identify primate-specific sequences in the human genome in comparative genomics study
07-20-2009
DETROIT, Mich.—In a comparative genomics research project using
data that is already available in the public domain, researchers from Wayne
State University (WSU) and the Genome Institute of Singapore have identified
primate-specific sequences in the human genome. According to the researchers,
who summarized their findings in the July 6 online edition of the Proceedings
of the National Academy of Sciences, the study has many implications for the
field of genomics research.
According to the researchers, the study, "Global discovery
of primate-specific genes in the human genome," offers an explanation for
lineage-specific uniqueness that is based on something completely new in
evolution, not on changes to old sequences or structures. Perhaps more
importantly, the researchers believe the study itself provides an interesting
critique of current genomic research methods.
The researchers began their quest to find primate-specific
genes by noting that despite the increasing availability of genome and
transcriptome sequence data, the genomic basis of primate phenotypic uniqueness
remains obscure. According to Dr. Leonard Lipovich, assistant professor in the
Center for Molecular Medicine and Genetics and the Department of Neurology at WSU's
School of Medicine and principal investigator of the study, this challenge is
due to multiple factors.
First, searching for non-conserved genes isn't emphasized by
any of the major players in genomics research, Lipovich says. Although factors
such as segmental duplications and positive selection have received much
attention as potential drivers of primate phenotypes, single-copy
primate-specific genes are poorly characterized, he says.
"There is a seldom-challenged assumption in the
genomics field that functional genes must be broadly evolutionarily conserved
and protein-coding," Lipovich says. "You hear a lot about using
genome and transcriptome data to look at conserved genes, but investigators
tend to ignore genomic intervals outside of those known genes. The efforts that
are out there are unimaginative and focus primarily on finding homologs of
known protein-coding genes in additional species, not on non-conserved genes
and their possible role in the genomic basis of interspecies
distinctions."
A second challenge, Lipovich notes, is that too much genomic
and transcriptome sequencing is being done without sufficient downstream
efforts to analyze the sequence data.
"The fact that we have genome and transcriptome
databases is not, by itself, helpful," he notes. "What might be
helpful is developing new algorithmic approaches. In addition, these datasets
frequently are not put together in a way that can help test specific
hypotheses."
The Genome Institute of Singapore's Sen-Kwan Tay, who worked
on the study as an extension of a dissertation for his M.Sc. degree in
bioinformatics, adds that data on the genomes of humans and our nearest
relative, the chimpanzee, show a 99 percent similarity in their sequences.
Explanations for the substantial phenotypic differences between the two species
include not only sequence differences but also regulatory and genome structure
differences and species-specific indels, Tay says.
"While the genome and transcriptome sequence data provide a
lot of what we know about interspecies sequence and genomic structure
differences, we still don't understand exactly how, mechanistically, these
differences lead to phenotypic differences such as the uniquely higher
cognitive capacity in humans, etc.," Tay says.
To address both of these concerns, the researchers screened
a catalog of 38,037 human transcriptional units (TUs), compiled from EST and
cDNA sequences in conjunction with the FANTOM3 transcriptome project, and
interrogated the intersection of transcriptome data and multispecies genome
alignments to search for primate-specific genes. Using transcript-to-genome
alignments, the comparative study mapped the human transcripts from FANTOM
against the genomes of a number of organisms, including the chimpanzee, to
look for genes that arose de novo.
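The screening logic described above can be illustrated with a minimal Python sketch. It is not the researchers' pipeline: it assumes that primate-specific insertions and segmental duplications have already been reduced to simple genomic intervals, and it keeps a TU only if it lies entirely inside an insertion and overlaps no duplication. The Interval class, the screen_tus function and all coordinates are hypothetical, invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A genomic interval; coordinates are 0-based, end-exclusive."""
    chrom: str
    start: int
    end: int


def contains(outer: Interval, inner: Interval) -> bool:
    """True if `inner` lies entirely within `outer` on the same chromosome."""
    return (outer.chrom == inner.chrom
            and outer.start <= inner.start
            and inner.end <= outer.end)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def screen_tus(tus, primate_insertions, seg_dups):
    """Return names of TUs inside a primate-specific insertion
    and outside every segmental duplication (hypothetical criteria)."""
    hits = []
    for name, tu in tus.items():
        in_insertion = any(contains(ins, tu) for ins in primate_insertions)
        in_seg_dup = any(overlaps(tu, sd) for sd in seg_dups)
        if in_insertion and not in_seg_dup:
            hits.append(name)
    return sorted(hits)


# Toy data, invented for illustration only.
tus = {
    "TU_A": Interval("chr1", 100, 200),  # inside an insertion, no duplication
    "TU_B": Interval("chr1", 500, 700),  # not inside any insertion
    "TU_C": Interval("chr2", 50, 150),   # inside an insertion but in a duplication
}
primate_insertions = [Interval("chr1", 50, 300), Interval("chr2", 0, 400)]
seg_dups = [Interval("chr2", 100, 1000)]

candidates = screen_tus(tus, primate_insertions, seg_dups)
print(candidates)
```

With this toy input only TU_A passes both filters. A real analysis would presumably work from multispecies alignment files and exon-level coordinates rather than hand-written intervals, and would use an indexed interval structure instead of the nested loops shown here.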
"We searched for new classes of interspecies differences,
specifically entirely new genes in primates, because such genes might provide
another explanation for lineage-specific uniqueness that is based on something
completely new in evolution, not on changes to old sequences or structures,"
Tay explains.
The researchers identified 131 TUs from transcribed
sequences residing within primate-specific insertions in nine-species sequence
alignments and outside of segmental duplications. Exons of 120 (92 percent) of
the TUs contained interspersed repeats, indicating that repeat insertions may
have contributed to primate-specific gene genesis. Fifty-nine (46 percent)
primate-specific TUs may encode proteins, the researchers also found. Although
primate-specific TU transcript lengths were comparable to known human gene mRNA
lengths overall, 92 (70 percent) primate-specific TUs were single-exon.
Thirty-two (24 percent) primate-specific TUs were localized to subtelomeric and
pericentromeric regions. Forty (31 percent) of the TUs were nested in introns
of known genes, indicating that primate-specific TUs may arise within older,
protein-coding regions. Primate-specific TUs were preferentially expressed in
reproductive organs and tissues, consistent with the expectation that emergence
of new, lineage-specific genes may accompany speciation or reproduction. Of the
33 primate-specific TUs with human Affymetrix microarray probe support, 21 were
differentially expressed in human teratozoospermia.
"This paper suggests that the emergence of primate-specific
and functional transcripts is due to de novo insertions, not arising from
duplication and subsequent accelerated sequence evolution," Tay says. "By
excluding segmental duplications often synonymous with gene genesis, we have
also shown that there exist single-copy transcripts which are also unique to
primates and presented initial evidence for function for these transcripts. For
example, 21 of our 131 primate-specific transcripts were found to be
differentially expressed in a separate study on severe teratozoospermia in men.
A comparison of our primate-specific transcripts with primate orphan genes
identified in a recent paper (Toll-Riera, et al.) shows no overlap—an
indication that the global primate-specific transcript catalog is far from
saturated and many primate-specific genes are still to be discovered."
The broader implication of the study is that not all genes
are necessarily conserved and protein-coding, Tay says.
"There are genes that are 'neither,' but they are interesting
because of their recent origin and possibly functional roles in reproduction
and behavior," he says. "Such genes need to be included in drug target screens,
RNA structure analyses, etc. We need to understand the mechanisms underlying
the birth of these insertions, especially their non-repetitive portions."
To accomplish that, researchers will now need to update the
set of primate-specific transcripts as new data becomes available, Tay says.
This will enable the researchers to confirm that such evolutionary novelties
are expressed, he adds.
"Additionally, there is a set of human transcripts which are
deleted in chimpanzees but conserved in the rhesus macaque, and possibly other
primate genomes," Tay says. "Such gene loss in the chimpanzees may also
contribute to the phenotypic differences between them and us."
The study may also serve as a paradigm of how research can
be conducted differently, Lipovich says.
"This is such an underrepresented area of research, and one
take-home message we have is that people should be looking at publicly
available data more," Lipovich says. "We established a generally applicable
paradigm for exploiting the union of two publicly available resources:
genome-wide sequence alignments and transcriptome data. Our approach was
unbiased in that it considered all publicly available human transcriptome data,
not just transcriptome data supporting already-known genes. Mapping this
transcriptome data onto multispecies genomic alignments enabled us to discover
primate-specific genes outside of annotated known genes."