AME identifies known motifs (provided by the user) that are enriched in your input sequences.

# S3 method for list
runAme(
  input,
  control = "shuffle",
  outdir = "auto",
  method = "fisher",
  database = NULL,
  meme_path = NULL,
  sequences = FALSE,
  silent = TRUE,
  ...
)

# S3 method for BStringSetList
runAme(
  input,
  control = "shuffle",
  outdir = "auto",
  method = "fisher",
  database = NULL,
  meme_path = NULL,
  sequences = FALSE,
  silent = TRUE,
  ...
)

# S3 method for default
runAme(
  input,
  control = "shuffle",
  outdir = "auto",
  method = "fisher",
  database = NULL,
  meme_path = NULL,
  sequences = FALSE,
  silent = TRUE,
  ...
)

runAme(
  input,
  control = "shuffle",
  outdir = "auto",
  method = "fisher",
  database = NULL,
  meme_path = NULL,
  sequences = FALSE,
  silent = TRUE,
  ...
)

Arguments

input

path to a FASTA file, or a DNAStringSet object. Optionally, the DNAStringSet names can contain a FASTA score for each sequence; this is required for partitioning mode.

control

default: "shuffle". Set to a DNAStringSet or a path to a FASTA file to use those sequences as the background in discriminative mode. If input is a list of DNAStringSet objects and control is set to a character vector of names in input, those entries will be used as background in discriminative mode, and AME will skip running on any control entries. (NOTE: if input contains an entry named "shuffle" and control is set to "shuffle", the input entry is used, not the AME shuffle algorithm.) If control is a Biostrings::BStringSetList (generated using get_sequence), all sequences in the list will be combined as the control set. Set to NA for partitioning based on the input FASTA score (see get_sequence() for assigning FASTA scores). If AME is run in partitioning mode but the input sequences are not assigned FASTA scores, the sequence order will be used as the score.
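As a minimal sketch of the named-list case described above, the snippet below splits the random example sequences into two groups and uses one as the background for the other. The group names "a" and "b" and the half-and-half split are illustrative assumptions, not part of the package; the motif file is the example database shipped with memes.

```r
# Hypothetical grouping: entry "b" serves as the control for entry "a".
res <- NULL
if (requireNamespace("memes", quietly = TRUE) && memes::meme_is_installed()) {
  seqs <- universalmotif::create_sequences(rng.seed = 123)
  names(seqs) <- seq_along(seqs)

  # Split the sequences into two arbitrary groups (illustration only)
  half <- length(seqs) %/% 2
  seq_list <- list(
    a = seqs[seq_len(half)],
    b = seqs[(half + 1):length(seqs)]
  )

  motif_file <- system.file("extdata", "flyFactorSurvey_cleaned.meme", package = "memes")

  # "b" is used as the background set; per the description above,
  # AME is not run on the control entry itself
  res <- memes::runAme(seq_list, control = "b", database = motif_file)
}
```

Because the sequences here are random, this call may well report no enrichment; the point is only the shape of the `input` and `control` arguments.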

outdir

Path to the output directory where data files are saved. If set to "auto", the location of the input files is used when file paths are passed; otherwise output is written to a temporary directory. default: "auto"

method

default: fisher (allowed values: fisher, ranksum, pearson, spearman, 3dmhg, 4dmhg)

database

path to a .meme format file, a universalmotif list object, a dreme results data.frame, or a list() of several of these. If objects are assigned names in the list, that name will be used as the database ID in the output. It is highly recommended that you set a name when not using a file path so that the database name is informative; otherwise the position in the list will be used as the database ID. For example, if the input is list("motifs.meme", list_of_motifs), the database IDs will be "motifs.meme" and "2". If the input is list("motifs.meme", "customMotifs" = list_of_motifs), the database IDs will be "motifs.meme" and "customMotifs".
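The naming rule above can be illustrated with a small base-R sketch. The helper `database_ids()` is hypothetical and written only to mirror the documented behaviour (name if given, else file path, else list position); it is not the code memes uses internally.

```r
# Hypothetical helper mirroring the documented database ID rule.
# This is an illustration, not the memes implementation.
database_ids <- function(db) {
  nms <- names(db)
  if (is.null(nms)) nms <- rep("", length(db))
  vapply(seq_along(db), function(i) {
    if (nzchar(nms[i])) {
      nms[i]                      # an explicit name wins
    } else if (is.character(db[[i]]) && length(db[[i]]) == 1) {
      db[[i]]                     # unnamed file path: use the path
    } else {
      as.character(i)             # unnamed object: use list position
    }
  }, character(1))
}

list_of_motifs <- list()  # stand-in for a universalmotif list

unnamed_ids <- database_ids(list("motifs.meme", list_of_motifs))
named_ids   <- database_ids(list("motifs.meme", "customMotifs" = list_of_motifs))
```

Here `unnamed_ids` reproduces the "motifs.meme" and "2" case from the text, and `named_ids` the "motifs.meme" and "customMotifs" case.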

meme_path

path to "meme/bin/" (default: NULL). Will use default search behavior as described in check_meme_install() if unset.

sequences

logical(1). If TRUE, adds the results from sequences.tsv to the returned data.frame as a list column named sequences. Valid only if method = "fisher". See AME outputs webpage for more information (Default: FALSE).

silent

whether to suppress stdout (default: TRUE). Setting this to FALSE is useful for debugging.

...

additional arguments passed to AME (see AME Flag table below)

Value

a data.frame of AME enrichment results. If run using a BStringSetList input, returns a list of data.frames.

Details

AME can be run using several modes:

  1. Discriminative mode: to discover motifs enriched relative to shuffled input, or a set of provided control sequences

  2. Partitioning mode: AME uses a per-sequence biological signal (the FASTA score) to test for an association between that signal and motif presence.

To read more about how AME works, see the AME Tutorial
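As a minimal sketch of partitioning mode, the call below sets control = NA so that AME partitions on the FASTA score. It assumes the score is carried in the sequence names, as described for the input argument; the sequence index is used here as a placeholder score and has no biological meaning.

```r
# Partitioning-mode sketch: scores come from the sequence names.
res_partition <- NULL
if (requireNamespace("memes", quietly = TRUE) && memes::meme_is_installed()) {
  seqs <- universalmotif::create_sequences(rng.seed = 123)

  # Placeholder FASTA scores (illustration only): the sequence index
  names(seqs) <- seq_along(seqs)

  motif_file <- system.file("extdata", "flyFactorSurvey_cleaned.meme", package = "memes")

  # control = NA switches from discriminative to partitioning mode
  res_partition <- memes::runAme(seqs, control = NA, database = motif_file)
}
```

With real data, the names would hold a measured signal for each sequence, such as a binding or expression score.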

Additional AME arguments

memes allows passing any valid flag to its target programs via .... For additional details on all valid AME arguments, see the AME Manual webpage. For convenience, a table of valid parameters and brief descriptions of their function is provided below:

| AME Flag | allowed values | default | description |
|---|---|---|---|
| kmer | integer(1) | 2 | kmer frequency to preserve when shuffling control sequences |
| seed | integer(1) | 1 | seed for random number generator when shuffling control sequences |
| scoring | "avg", "max", "sum", "totalhits" | "avg" | Method for scoring a sequence for matches to a PWM (avg, max, sum, totalhits) |
| hit_lo_fraction | numeric(1) | 0.25 | fraction of hit log odds score to exceed to be considered a "hit" |
| evalue_report_threshold | numeric(1) | 10 | E-value threshold for reporting a motif as significantly enriched |
| fasta_threshold | numeric(1) | 0.001 | AME will classify sequences with FASTA scores below this value as positives. Only valid when method = "fisher", poslist = "pwm", control = NA, fix_partition = NULL. |
| fix_partition | numeric(1) | NULL | AME evaluates only the partition of the first N sequences. Only works when control = NA and poslist = "fasta" |
| poslist | "pwm", "fasta" | "fasta" | When using partitioning mode (control = NA), test thresholds on either PWM or FASTA score |
| log_fscores | logical(1) | FALSE | Convert FASTA scores into log-space (only used when method = "pearson") |
| log_pwmscores | logical(1) | FALSE | Convert PWM scores into log-space (only used for method = "pearson" or method = "spearman") |
| linreg_switchxy | logical(1) | FALSE | Make the x-points FASTA scores and y-points PWM scores (only used for method = "pearson" or method = "spearman") |
| xalph | file path | NULL | alphabet file to use if input motifs are in different alphabet than input sequences |
| bfile | "motif", "motif-file", "uniform", path to file | NULL | source of 0-order background model. If "motif" or "motif-file", 0-order letter frequencies in the first motif file are used. If "uniform", uses uniform letter frequencies. |
| motif_pseudo | numeric(1) | 0.1 | Add this pseudocount when converting from frequency matrix to log-odds matrix |
| inc | character(1) | NULL | use only motifs with names matching this regex |
| exc | character(1) | NULL | exclude motifs with names matching this regex |
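As a short sketch of passing flags through ..., the call below sets two of the parameters from the table. The specific values ("max" scoring and an E-value threshold of 5) are arbitrary choices for illustration, not recommendations.

```r
# Sketch: forwarding extra AME flags via `...`
res_flags <- NULL
if (requireNamespace("memes", quietly = TRUE) && memes::meme_is_installed()) {
  seqs <- universalmotif::create_sequences(rng.seed = 123)
  names(seqs) <- seq_along(seqs)

  motif_file <- system.file("extdata", "flyFactorSurvey_cleaned.meme", package = "memes")

  res_flags <- memes::runAme(
    seqs,
    database = motif_file,
    scoring = "max",              # score each sequence by its best PWM match
    evalue_report_threshold = 5   # report threshold stricter than the default of 10
  )
}
```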

Citation

If you use runAme() in your analysis, please cite:

Robert McLeay and Timothy L. Bailey, "Motif Enrichment Analysis: A unified framework and method evaluation", BMC Bioinformatics, 11:165, 2010, doi:10.1186/1471-2105-11-165.

Licensing

The MEME Suite is free for non-profit use, but for-profit users should purchase a license. See the MEME Suite Copyright Page for details.

Examples

if (meme_is_installed()) {
# Create random named sequences as input for example
seqs <- universalmotif::create_sequences(rng.seed = 123)
names(seqs) <- seq_along(seqs)

# An example path to a motif database file in .meme format
motif_file <- system.file("extdata", "flyFactorSurvey_cleaned.meme", package = "memes")

runAme(seqs, database = motif_file)

# Dreme results dataset for example
dreme_xml <- system.file("extdata", "dreme.xml", package = "memes")
dreme_results <- importDremeXML(dreme_xml)

# database can be set to multiple values like so: 
runAme(seqs, database = list(motif_file, "my_dreme_motifs" = dreme_results))
}
#> AME detected no enrichment
#> AME detected no enrichment
#> NULL