TopHat is a program that aligns RNA-Seq reads to a genome in order to identify exon-exon splice junctions. It is built on the ultrafast short read mapping program Bowtie. TopHat runs on Linux and OS X.
What are general categories (keywords, labels) for this software ?
Is there any on-line documentation about the software ?
What language(s) is the software written in ?
What Operating Systems can the software run on ?
How can one install the software ?
What other software does the software require to be installed ?
[OPTIONAL] Are there estimates of how long it takes to run this software on average ?
[OPTIONAL] Are there any memory requirements for this software ?
[OPTIONAL] Are there any other important details about the implementation of this code (parallelization, special hardware, etc) ?
Run - Testing execution
Is there any test data available for the software ?
[OPTIONAL] Are there any specific instructions for testing the software ?
Experiment - Run with other data
What input files does the software require ?
One zip file per component run. A single sample can have more than one FASTQ file, and these must be analyzed together, so the files are zipped to keep a 1:1 mapping between input file and sample.
A zip file containing the Bowtie2 indices and the FASTA file. The indices, the FASTA file, and the FASTA index file need to be in the same directory for TopHat2 to run smoothly.
A JSON or XML file containing the TopHat2 parameters. TopHat2 has too many parameters to expose each one on the component without cluttering it.
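As a rough illustration of how a JSON parameter file might be turned into a TopHat2 command line, here is a minimal sketch. The JSON layout (long option names as keys, without the leading dashes), the function name build_tophat2_command, the index prefix, and the FASTQ file names are all assumptions made for this example and are not taken from the component itself. It also assumes single-end reads, with the FASTQ files of one sample passed as a comma-separated list.

```python
import json


def build_tophat2_command(params_json, index_prefix, fastq_files):
    """Build a tophat2 argument list from a JSON string of options.

    Assumed (hypothetical) JSON layout: keys are TopHat2 long option
    names without the leading "--"; values are the option values.
    A value of true marks a flag that takes no argument.
    """
    params = json.loads(params_json)
    cmd = ["tophat2"]
    for name, value in sorted(params.items()):
        flag = "--" + name
        if value is True:
            cmd.append(flag)                # switch without an argument
        elif value is False or value is None:
            continue                        # option disabled: leave it out
        else:
            cmd.extend([flag, str(value)])  # option followed by its value
    cmd.append(index_prefix)                # Bowtie2 index prefix
    cmd.append(",".join(fastq_files))       # all FASTQ files of one sample
    return cmd


# Hypothetical parameter file content and input names.
example = '{"num-threads": 4, "output-dir": "tophat_out", "no-coverage-search": true}'
cmd = build_tophat2_command(
    example, "genome_index/hg19", ["s1_L001.fastq", "s1_L002.fastq"]
)
print(" ".join(cmd))
```

The resulting list could be handed to subprocess.run once the reads and index zip files have been unpacked. Keeping the parameters in one file means the component only needs three inputs, whatever options are set.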
What are the input parameters used for this software?
What output files does the software produce ?
BAM (Binary Alignment/Map) file produced by aligning the FASTQ reads to the human reference genome. It contains the reads that were successfully aligned.
Reads that could not be mapped to the human reference genome are written to unmappedBAM by Bowtie2 and TopHat2.
Track of the splice junctions that TopHat2 reported for the RNA-Seq sample.
Track of insertions (mutations) reported by TopHat2
Track of deletions (mutations) reported by TopHat2
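A component wrapper may want to confirm that all five outputs exist after a run. The sketch below assumes TopHat2's default output file names (accepted_hits.bam, unmapped.bam, junctions.bed, insertions.bed, deletions.bed) inside the output directory. The helper name missing_outputs is hypothetical, and the demonstration uses an empty placeholder directory instead of a real run.

```python
import os
import tempfile

# Assumed default TopHat2 output names; adjust if the component renames them.
EXPECTED_OUTPUTS = [
    "accepted_hits.bam",  # aligned reads
    "unmapped.bam",       # reads that did not map
    "junctions.bed",      # splice junction track
    "insertions.bed",     # insertion track
    "deletions.bed",      # deletion track
]


def missing_outputs(output_dir):
    """Return the expected output files that are absent from output_dir."""
    return [
        name
        for name in EXPECTED_OUTPUTS
        if not os.path.isfile(os.path.join(output_dir, name))
    ]


# Demonstration on a temporary directory holding two placeholder files.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ["accepted_hits.bam", "junctions.bed"]:
        open(os.path.join(demo_dir, name), "w").close()
    demo_missing = missing_outputs(demo_dir)

print(demo_missing)
```

An empty list means every expected file was found. A non-empty list names the files the run did not produce.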
[OPTIONAL] Are there any relevant data catalogs that can be used with this software ?
Compose - Run with other software
What other software can interoperate with this one?
[OPTIONAL] Is this software typically used with other software in a workflow ? (eg: for visualization, preprocessing, postprocessing, etc)
Cite - Scientific publications
Is there a preferred publication or citation for this software ?
Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R. and Salzberg, S. “TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions”. Genome Biology. 14:R36. 2013.