TopHat2 [Cole Trapnell]

IDENTIFY

Locate - Unique description

  • What is the software called ?
    • TopHat2
  • What is a short description for this software ?
    • TopHat is a program that aligns RNA-Seq reads to a genome in order to identify exon-exon splice junctions. It is built on the ultrafast short read mapping program Bowtie. TopHat runs on Linux and OS X.
  • What are general categories (keywords, labels) for this software ?
    • Is there a project website for the software ?
    • [OPTIONAL] What is the DOI or any other unique identifier for this software (or software version) ?

      UNDERSTAND

      Trust - Quality and ratings

      • Who created this software? (Project, Organization, Person, Initiative, etc.)
        • Cole Trapnell
      • Are there any additional contributors of note for this software ?
        • Daehwan Kim
        • Steven Salzberg
      • What useful features of this software are worth highlighting ?
        • [OPTIONAL] Who is the publisher of this software if not the author ?
          • [OPTIONAL] How can a user get support for the software ? (eg. Report bugs, request features and extensions, etc)
            • [OPTIONAL] Has the software been adopted in a project, organization or by a person?
              • [OPTIONAL] Is there any information about uses of this software (papers, research labs, etc) ?
                • [OPTIONAL] Are there any statistics of its use ?
                  • [OPTIONAL] Are there any publications where the software is used ?
                    • [OPTIONAL] Is there any benchmark information about the software ?
                      • [OPTIONAL] What are the funding sources for this software?
                        • [OPTIONAL] What are the ratings for this software?

                          Relate - Domain knowledge

                          • What are domain specific keywords for this software ? (eg: hydrology, climate)
                            • [OPTIONAL] Is there any other similar software that you know of ?
                              • [OPTIONAL] What are the recommended uses and assumptions for the software ?
                                • [OPTIONAL] Are there any constraints on use, situations it is not designed for, simplifications ?

                                  EXECUTE

                                  Access - Download

                                  Install - Execution requirements

                                  • Is there any on-line documentation about the software ?
                                    • What language(s) is the software written in ?
                                      • What Operating Systems can the software run on ?
                                        • How can one install the software ?
                                          • What other software does the software require to be installed ?
                                            • [OPTIONAL] Are there estimates of how long it takes to run this software on average ?
                                              • [OPTIONAL] Are there any memory requirements for this software ?
                                                • [OPTIONAL] Are there any other important details about the implementation of this code (parallelization, special hardware, etc) ?

                                                  Run - Testing execution

                                                  • Is there any test data available for the software ?
                                                    • [OPTIONAL] Are there any specific instructions for testing the software ?

                                                      DO RESEARCH

                                                      Experiment - Run with other data

                                                      • What input files does the software require ?
                                                          • File Id: 
                                                            inputZippedFASTQ
                                                          • File Type: 
                                                            One zip file per component run. A single sample can have more than one FASTQ file, and those files must be analyzed together, so they are zipped to keep a 1:1 mapping between input file and sample.
                                                          • File Id: 
                                                            refFastaBundle
                                                          • File Type: 
                                                            Zip file containing the Bowtie2 indices and the FASTA files. The indices, the FASTA file, and the FASTA index file need to be in the same directory for TopHat2 to run smoothly.
                                                          • File Id: 
                                                            tophat2Params
                                                          • File Type: 
                                                            JSON or XML file holding the TopHat2 parameters. TopHat2 has too many parameters to expose each one individually on the component without cluttering it.
                                                      • What are the input parameters used for this software?
                                                        • What output files does the software produce ?
                                                            • File Id: 
                                                              alignedBAM
                                                            • File Type: 
                                                              BAM (Binary Alignment/Map) file produced by aligning the FASTQ reads to the human reference genome. It contains the reads that were successfully aligned to the reference.
                                                            • File Id: 
                                                              unmappedBAM
                                                            • File Type: 
                                                              BAM file of the reads that could not be mapped to the human reference genome. Bowtie2 and TopHat2 write these reads to unmappedBAM.
                                                            • File Id: 
                                                              junBED
                                                            • File Type: 
                                                              Track of the splice junctions reported by TopHat2 for the RNA-Seq input.
                                                            • File Id: 
                                                              insertionsBED
                                                            • File Type: 
                                                              Track of insertions (mutations) reported by TopHat2
                                                            • File Id: 
                                                              deletionsBED
                                                            • File Type: 
                                                              Track of deletions (mutations) reported by TopHat2
                                                        • [OPTIONAL] Are there any relevant data catalogs that can be used with this software ?
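
A minimal Python sketch of how a wrapper component might tie the inputs and outputs listed above together. Several things here are assumptions and not part of the original description: tophat2Params is taken to be a flat JSON object mapping option names to values, the reads are treated as single-end (so all FASTQ files go into one comma-separated list), and every helper name is hypothetical. The output file names are the ones TopHat2 writes into its output directory.

```python
import json
import zipfile
from pathlib import Path

# File Id (from this description) -> file name TopHat2 writes in its output dir.
OUTPUT_FILES = {
    "alignedBAM": "accepted_hits.bam",
    "unmappedBAM": "unmapped.bam",
    "junBED": "junctions.bed",
    "insertionsBED": "insertions.bed",
    "deletionsBED": "deletions.bed",
}

FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def load_params(path):
    """Read tophat2Params, assumed (hypothetically) to be a flat JSON object."""
    with open(path) as handle:
        return json.load(handle)


def params_to_flags(params):
    """Convert {"--flag": value} into CLI tokens.

    Assumed convention: true means a bare switch, false or null means
    "leave the option out", anything else is passed as the option's value.
    """
    flags = []
    for name, value in params.items():
        if value is True:
            flags.append(name)
        elif value is False or value is None:
            continue
        else:
            flags.extend([name, str(value)])
    return flags


def extract_fastqs(zip_path, dest):
    """Unpack inputZippedFASTQ and return the FASTQ files it held, sorted."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return sorted(p for p in Path(dest).rglob("*") if p.name.endswith(FASTQ_SUFFIXES))


def build_command(index_base, fastq_paths, params, out_dir):
    """Assemble: tophat2 [options] <index_base> <reads1[,reads2,...]>."""
    reads = ",".join(str(p) for p in fastq_paths)  # single-end assumption
    return ["tophat2", "-o", str(out_dir), *params_to_flags(params),
            str(index_base), reads]


def expected_outputs(out_dir):
    """Map each output File Id to the path TopHat2 should have written."""
    return {file_id: Path(out_dir) / name for file_id, name in OUTPUT_FILES.items()}
```

The list returned by build_command can be passed to subprocess.run once the reference bundle has been unpacked so that the index base, FASTA file, and FASTA index sit in one directory. Paired-end data would need two separate read lists instead of the single list used here.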

                                                          Compose - Run with other software

                                                          • What other software can interoperate with this one?
                                                            • [OPTIONAL] Is this software typically used with other software in a workflow ? (eg: for visualization, preprocessing, postprocessing, etc)

                                                              Cite - Scientific publications

                                                              • Is there a preferred publication or citation for this software ?

                                                              GET SUPPORT

                                                              Discuss - Support and community

                                                              • What is the e-mail contact for this software?
                                                                • [OPTIONAL] What is the support offered for this software?

                                                                  UPDATE

                                                                  Contribute - Evolution

                                                                  • [OPTIONAL] How is the software being developed or maintained ?
                                                                    • [OPTIONAL] Are there any on-line resources for accessing the developer community for this software ? (eg. discussion board, wiki, etc)

                                                                      Track - Versions

                                                                      • What versions does the software have ?