TomTom compares input motifs to a database of known, user-provided motifs to identify matches.

runTomTom(
  input,
  database = NULL,
  outdir = "auto",
  thresh = 10,
  min_overlap = 5,
  dist = "ed",
  evalue = TRUE,
  silent = TRUE,
  meme_path = NULL,
  ...
)

Arguments

input

path to .meme format file of motifs, a list of universalmotifs, or a universalmotif data.frame object (such as the output of runDreme())

database

path to .meme format file to use as reference database (or list of universalmotifs). NOTE: p-value estimates are inaccurate when the database has fewer than 50 entries.

outdir

directory in which to store tomtom results (overwritten if it already exists). Default: location of the input .meme file, or a temporary location if using universalmotif input.

thresh

report matches with a significance value less than or equal to this value. If evalue = TRUE (default), this is an E-value threshold (default = 10). If evalue = FALSE, this is a q-value threshold between 0 and 1 (default = 0.5).

min_overlap

only report matches that overlap by this value or more, unless input motif is shorter, in which case the shorter length is used as the minimum value

dist

distance metric. Valid arguments: allr | ed | kullback | pearson | sandelin | blic1 | blic5 | llr1 | llr5. Default: ed (Euclidean distance).

evalue

whether to use E-value as significance threshold (default: TRUE). If evalue = FALSE, uses q-value instead.

silent

suppress printing stderr to console (default: TRUE).

meme_path

path to "meme/bin/" (optional). If unset, will check the R environment variable "MEME_BIN" (set in .Renviron), or the option "meme_bin" (set with options(meme_bin = "path/to/meme/bin")).

...

additional flags passed to tomtom using cmdfun formatting (see table below for details)
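As a rough illustration of how thresh and evalue interact, the sketch below runs the same query under both threshold modes. It assumes the memes package and a working MEME Suite install are available, and it reuses the example motif and bundled database from the Examples section; the variable names (has_meme, by_evalue, by_qvalue) are illustrative only.

```r
# Only run when memes and the MEME Suite can both be found (assumption:
# neither is guaranteed to be installed on the reader's machine).
has_meme <- requireNamespace("memes", quietly = TRUE) && memes::meme_is_installed()

if (has_meme) {
  motif <- universalmotif::create_motif("CCRAAAW")
  database <- system.file("extdata", "flyFactorSurvey_cleaned.meme", package = "memes")

  # evalue = TRUE (the default): thresh is interpreted as an E-value cutoff.
  by_evalue <- memes::runTomTom(motif, database, thresh = 10)

  # evalue = FALSE: thresh is interpreted as a q-value cutoff between 0 and 1.
  by_qvalue <- memes::runTomTom(motif, database, thresh = 0.5, evalue = FALSE)
}
```

Both calls return the same kind of data.frame; only the criterion used to decide which matches are reported differs.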

Value

data.frame of match results. Contains best_match_motif column of universalmotif objects with the matched PWM from the database, a series of best_match_* columns describing the TomTom results of the match, and a tomtom list column storing the ranked list of possible matches to each motif. If a universalmotif data.frame is used as input, these columns are appended to the data.frame. If no matches are returned, tomtom and best_match_motif columns will be set to NA and a message indicating this will print.

Details

runTomTom will rank matches by significance and return a best match motif for each input (whose properties are stored in the best_match_* columns) as well as a ranked list of all possible matches stored in the tomtom list column.
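The two levels of output described above can be inspected roughly as follows. This is a sketch that assumes memes and the MEME Suite are installed; it relies only on the best_match_* and tomtom columns shown in the example output on this page, and the names has_meme, res, and all_matches are illustrative.

```r
# Guard so the sketch is skipped when memes or the MEME Suite is missing.
has_meme <- requireNamespace("memes", quietly = TRUE) && memes::meme_is_installed()

if (has_meme) {
  motif <- universalmotif::create_motif("CCRAAAW")
  database <- system.file("extdata", "flyFactorSurvey_cleaned.meme", package = "memes")
  res <- memes::runTomTom(motif, database)

  # Top-ranked match for each input motif, one value per row of the result.
  print(res$best_match_name)
  print(res$best_match_eval)

  # Ranked list of all reported matches for the first input motif.
  # When no matches are returned this entry is NA instead.
  all_matches <- res$tomtom[[1]]
  str(all_matches, max.level = 1)
}
```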

Additional arguments

runTomTom() accepts all valid tomtom arguments passed to ..., as described in the tomtom command-line reference. For convenience, the table below lists the valid arguments with their default values and descriptions.

| TomTom flag | Allowed values | Default | Description |
|---|---|---|---|
| bfile | file path | NULL | path to background model for converting frequency matrix to log-odds score (not used when dist is set to "ed", "kullback", "pearson", or "sandelin") |
| motif_pseudo | numeric | 0.1 | pseudocount to add to motifs |
| xalph | logical | FALSE | convert alphabet of target database to alphabet of query database |
| norc | logical | FALSE | do not score reverse complements of motifs |
| incomplete_scores | logical | FALSE | compute scores using only aligned columns |
| thresh | numeric | 0.5 | only report matches with significance values <= this value. Unless evalue = TRUE, this value must be < 1. |
| internal | logical | FALSE | forces the shorter motif to be completely contained in the longer motif |
| min_overlap | integer | 1 | only report matches that overlap by this number of positions or more. If query motif is smaller than this value, its width is used as the min overlap for that query |
| time | integer | NULL | maximum runtime in CPU seconds (default: no limit) |
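As a sketch of how the flags in the table above might be supplied through ..., the call below passes one logical flag and one valued flag as named arguments. It assumes memes and the MEME Suite are installed and reuses the motif and database from the Examples section; has_meme and res_flags are illustrative names, and the chosen flag values are arbitrary.

```r
# Guard so the sketch is skipped when memes or the MEME Suite is missing.
has_meme <- requireNamespace("memes", quietly = TRUE) && memes::meme_is_installed()

if (has_meme) {
  motif <- universalmotif::create_motif("CCRAAAW")
  database <- system.file("extdata", "flyFactorSurvey_cleaned.meme", package = "memes")

  res_flags <- memes::runTomTom(
    motif, database,
    norc = TRUE,         # logical flag: do not score reverse complements
    motif_pseudo = 0.5   # valued flag: pseudocount added to motifs
  )
}
```

Arguments named in the function signature, such as thresh and min_overlap, are set directly and do not need to go through ....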

Citation

If you use runTomTom() in your analysis, please cite:

Shobhit Gupta, JA Stamatoyannopoulos, Timothy Bailey and William Stafford Noble, "Quantifying similarity between motifs", Genome Biology, 8(2):R24, 2007.

Licensing

The MEME Suite is free for non-profit use, but for-profit users should purchase a license. See the MEME Suite Copyright Page for details.

Examples

if (meme_is_installed()) {
  motif <- universalmotif::create_motif("CCRAAAW")
  database <- system.file("extdata", "flyFactorSurvey_cleaned.meme", package = "memes")

  runTomTom(motif, database)
}
#>         motif  name consensus alphabet strand icscore type pseudocount
#> 1 <mot:motif> motif   CCRAAAW      DNA     +-      12  PPM           0
#>                      bkg  best_match_name best_match_altname
#> 1 0.25, 0.25, 0.25, 0.25 Eip93F_SANGER_10             Eip93F
#>              best_db_name best_match_offset best_match_pval best_match_eval
#> 1 flyFactorSurvey_cleaned                 4        1.91e-07        0.000106
#>   best_match_qval best_match_strand
#> 1        0.000213                 +
#>                                                       best_match_motif
#> 1 <S4 class ‘universalmotif’ [package “universalmotif”] with 20 slots>
#>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   tomtom
#> 1 Eip93F_SANGER_10, rib_SANGER_5, Ets65A_SANGER_10, CG12768_SANGER_5, Eip93F, rib, Ets65A, CG12768, <S4 class ‘universalmotif’ [package “universalmotif”] with 20 slots>, <S4 class ‘universalmotif’ [package “universalmotif”] with 20 slots>, <S4 class ‘universalmotif’ [package “universalmotif”] with 20 slots>, <S4 class ‘universalmotif’ [package “universalmotif”] with 20 slots>, flyFactorSurvey_cleaned, flyFactorSurvey_cleaned, flyFactorSurvey_cleaned, flyFactorSurvey_cleaned, 4, 1, 0, 5, 1.91e-07, 0.0015, 0.0126, 0.015, 0.000106, 0.832, 7, 8.36, 0.000213, 0.832, 1, 1, +, +, +, +
#> 
#> [Hidden empty columns: altname, family, organism, nsites, bkgsites,
#>   pval, qval, eval.]