AME identifies known motifs (provided by the user) that are enriched in your input sequences.
# S3 method for list
runAme(
input,
control = "shuffle",
outdir = "auto",
method = "fisher",
database = NULL,
meme_path = NULL,
sequences = FALSE,
silent = TRUE,
...
)
# S3 method for BStringSetList
runAme(
input,
control = "shuffle",
outdir = "auto",
method = "fisher",
database = NULL,
meme_path = NULL,
sequences = FALSE,
silent = TRUE,
...
)
# S3 method for default
runAme(
input,
control = "shuffle",
outdir = "auto",
method = "fisher",
database = NULL,
meme_path = NULL,
sequences = FALSE,
silent = TRUE,
...
)
runAme(
input,
control = "shuffle",
outdir = "auto",
method = "fisher",
database = NULL,
meme_path = NULL,
sequences = FALSE,
silent = TRUE,
...
)
path to fasta, or DNAstringset (optional: DNAStringSet object names contain fasta score, required for partitioning mode)
default: "shuffle", or set to
DNAstringset or path to fasta file to use those sequences in discriminative
mode. If input
is a list of DNAStringSet objects, and control
is set to
a character vector of names in input
, those regions will be used as
background in discriminitive mode and AME will skip running on any
control
entries (NOTE: if input
contains an entry named "shuffle" and
control is set to "shuffle", it will use the input
entry, not the AME
shuffle algorithm). If control
is a Biostrings::BStringSetList (generated
using get_sequence
), all sequences in the list will be combined as the
control set. Set to NA
for partitioning based on input fasta score (see
get_sequence()
for assigning fasta score). If input sequences are not
assigned fasta scores but AME is run in partitioning mode, the sequence
order will be used as the score.
Path to output directory location to save data files. If set to "auto", will use location of input files if passing file paths, otherwise will write to a temporary directory. default: "auto"
default: fisher (allowed values: fisher, ranksum, pearson, spearman, 3dmhg, 4dmhg)
path to .meme format file, universalmotif list object, dreme results data.frame, or list() of multiple of these. If objects are assigned names in the list, that name will be used as the database id in the output. It is highly recommended you set a name if not using a file path so the database name will be informative, otherwise the position in the list will be used as the database id. For example, if the input is: list("motifs.meme", list_of_motifs), the database id's will be: "motifs.meme" and "2". If the input is list("motifs.meme", "customMotifs" = list_of_motifs), the database id's will be "motifs.meme" and "customMotifs".
path to "meme/bin/" (default: NULL
). Will use default
search behavior as described in check_meme_install()
if unset.
logical(1)
add results from sequences.tsv
to sequences
list column to returned data.frame. Valid only if method = "fisher". See
AME outputs
webpage for more information (Default: FALSE).
whether to suppress stdout (default: TRUE), useful for debugging.
additional arguments passed to AME (see AME Flag table below)
a data.frame of AME enrichment results. If run using a BStringsSetList input, returns a list of data.frames.
AME can be run using several modes:
Discriminative mode: to discover motifs enriched relative to shuffled input, or a set of provided control sequences
Partitioning mode: in which AME uses some biological signal to identify the association between the signal and motif presence.
To read more about how AME works, see the AME Tutorial
Additional AME arguments
memes allows passing any valid flag to it's target programs via ...
. For
additional details for all valid AME arguments, see the AME Manual webpage. For convenience, a table
of valid parameters, and brief descriptions of their function are provided
below:
AME Flag | allowed values | default | description |
kmer | integer(1) | 2 | kmer frequency to preserve when shuffling control sequences |
seed | integer(1) | 1 | seed for random number generator when shuffling control sequences |
scoring | "avg", "max", "sum", "totalhits" | "avg" | Method for scoring a sequence for matches to a PWM (avg, max, sum, totalhits) |
hit_lo_fraction | numeric(1) | 0.25 | fraction of hit log odds score to exceed to be considered a "hit" |
evalue_report_threshold | numeric(1) | 10 | E-value threshold for reporting a motif as significantly enriched |
fasta_threshold | numeric(1) | 0.001 | AME will classify sequences with FASTA scores below this value as positives. Only valid when method = "fisher", poslist = "pwm", control = NA, fix_partition = NULL . |
fix_partition | numeric(1) | NULL | AME evaluates only the partition of the first N sequences. Only works when control = NA and poslist = "fasta" |
poslist | "pwm", "fasta" | "fasta" | When using paritioning mode (control = NA ), test thresholds on either PWM or Fasta score |
log_fscores | logical(1) | FALSE | Convert FASTA scores into log-space (only used when method = "pearson" ) |
log_pwmscores | logical(1) | FALSE | Convert PWM scores into log-space (only used for method = "pearson" or method = "spearman ) |
lingreg_switchxy | logical(1) | FALSE | Make the x-points FASTA scores and y-points PWM scores (only used for method = "pearson" or method = "spearman ) |
xalph | file path | NULL(1) | alphabet file to use if input motifs are in different alphabet than input sequences |
bfile | "motif", "motif-file", "uniform", path to file | NULL | source of 0-order background model. If "motif" or "motif-file" 0-order letter frequencies in the first motif file are used. If "uniform" uses uniform letter frequencies. |
motif_pseudo | numeric(1) | 0.1 | Addd this pseudocount when converting from frequency matrix to log-odds matrix |
inc | character(1) | NULL | use only motifs with names matching this regex |
exc | character(1) | NULL | exclude motifs with names matching this regex |
If you use runAme()
in your analysis, please cite:
Robert McLeay and Timothy L. Bailey, "Motif Enrichment Analysis: A unified framework and method evaluation", BMC Bioinformatics, 11:165, 2010, doi:10.1186/1471-2105-11-165. full text
The MEME Suite is free for non-profit use, but for-profit users should purchase a license. See the MEME Suite Copyright Page for details.
if (meme_is_installed()) {
# Create random named sequences as input for example
seqs <- universalmotif::create_sequences(rng.seed = 123)
names(seqs) <- seq_along(seqs)
# An example path to a motif database file in .meme format
motif_file <- system.file("extdata", "flyFactorSurvey_cleaned.meme", package = "memes")
runAme(seqs, database = motif_file)
# Dreme results dataset for example
dreme_xml <- system.file("extdata", "dreme.xml", package = "memes")
dreme_results <- importDremeXML(dreme_xml)
# database can be set to multiple values like so:
runAme(seqs, database = list(motif_file, "my_dreme_motifs" = dreme_results))
}
#> AME detected no enrichment
#> AME detected no enrichment
#> NULL