

Identifying the transcription start sites (TSS) of genes is essential for characterizing promoter regions. Several protocols have been developed to capture the 5′ end of transcripts via Cap Analysis of Gene Expression (CAGE) or linker-ligation strategies such as Paired-End Analysis of Transcription Start Sites (PEAT), but these often require large amounts of tissue. More recently, nanoCAGE was developed for sequencing on the Illumina GAIIx to overcome these difficulties. Here we present the first publicly available adaptation of nanoCAGE for sequencing on recent ultra-high-throughput platforms such as the Illumina HiSeq-2000, and CapFilter, a computational pipeline that greatly increases confidence in TSS identification. We report excellent gene coverage, reproducibility, and precision in transcription start site discovery for samples from Arabidopsis thaliana roots. NanoCAGE-XL together with CapFilter allows for genome-wide identification of high-confidence transcription start sites in large eukaryotic genomes.

Accurate identification of transcription start sites (TSS) for RNA polymerase II (pol-II) genes is critical for determining promoter location, which in turn facilitates accurate determination of functional sequence control elements. While standard RNA-Seq methodologies can provide some insight into the nature of full-length transcripts, a heavy 3′ bias in data outcomes must be addressed with a 5′ cap-trapping strategy. Several overall strategies have been published for application to animal tissues and cell lines; these are well characterized by three protocols. The Paired-End Analysis of Transcription Start Sites (PEAT) protocol enriches for capped transcripts in two steps: an initial dephosphorylation of uncapped transcripts, followed by the ligation of a short “tag” sequence to the 5′ ends of capped transcripts.
