Motif enrichment using AME — runAme.list • memes

AME identifies known motifs (provided by the user) that are enriched in your input sequences.

# S3 method for list
runAme(
  input,
  control = "shuffle",
  outdir = "auto",
  method = "fisher",
  database = NULL,
  meme_path = NULL,
  sequences = FALSE,
  silent = TRUE,
  ...
)

# S3 method for BStringSetList
runAme(
  input,
  control = "shuffle",
  outdir = "auto",
  method = "fisher",
  database = NULL,
  meme_path = NULL,
  sequences = FALSE,
  silent = TRUE,
  ...
)

# S3 method for default
runAme(
  input,
  control = "shuffle",
  outdir = "auto",
  method = "fisher",
  database = NULL,
  meme_path = NULL,
  sequences = FALSE,
  silent = TRUE,
  ...
)

runAme(
  input,
  control = "shuffle",
  outdir = "auto",
  method = "fisher",
  database = NULL,
  meme_path = NULL,
  sequences = FALSE,
  silent = TRUE,
  ...
)

Arguments

input: path to fasta, or DNAstringset (optional: DNAStringSet object names contain fasta score, required for partitioning mode)
control: default: "shuffle", or set to DNAstringset or path to fasta file to use those sequences in discriminative mode. If input is a list of DNAStringSet objects, and control is set to a character vector of names in input, those regions will be used as background in discriminitive mode and AME will skip running on any control entries (NOTE: if input contains an entry named "shuffle" and control is set to "shuffle", it will use the input entry, not the AME shuffle algorithm). If control is a Biostrings::BStringSetList (generated using get_sequence), all sequences in the list will be combined as the control set. Set to NA for partitioning based on input fasta score (see get_sequence() for assigning fasta score). If input sequences are not assigned fasta scores but AME is run in partitioning mode, the sequence order will be used as the score.
outdir: Path to output directory location to save data files. If set to "auto", will use location of input files if passing file paths, otherwise will write to a temporary directory. default: "auto"
method: default: fisher (allowed values: fisher, ranksum, pearson, spearman, 3dmhg, 4dmhg)
database: path to .meme format file, universalmotif list object, dreme results data.frame, or list() of multiple of these. If objects are assigned names in the list, that name will be used as the database id in the output. It is highly recommended you set a name if not using a file path so the database name will be informative, otherwise the position in the list will be used as the database id. For example, if the input is: list("motifs.meme", list_of_motifs), the database id's will be: "motifs.meme" and "2". If the input is list("motifs.meme", "customMotifs" = list_of_motifs), the database id's will be "motifs.meme" and "customMotifs".
meme_path: path to "meme/bin/" (default: NULL). Will use default search behavior as described in check_meme_install() if unset.
sequences: logical(1) add results from sequences.tsv to sequences list column to returned data.frame. Valid only if method = "fisher". See AME outputs webpage for more information (Default: FALSE).
silent: whether to suppress stdout (default: TRUE), useful for debugging.
...: additional arguments passed to AME (see AME Flag table below)

Value

a data.frame of AME enrichment results. If run using a BStringsSetList input, returns a list of data.frames.

Details

AME can be run using several modes:

Discriminative mode: to discover motifs enriched relative to shuffled input, or a set of provided control sequences
Partitioning mode: in which AME uses some biological signal to identify the association between the signal and motif presence.

To read more about how AME works, see the AME Tutorial

Additional AME arguments

memes allows passing any valid flag to it's target programs via .... For additional details for all valid AME arguments, see the AME Manual webpage. For convenience, a table of valid parameters, and brief descriptions of their function are provided below:

AME Flag	allowed values	default	description
kmer	`integer(1)`	2	kmer frequency to preserve when shuffling control sequences
seed	`integer(1)`	1	seed for random number generator when shuffling control sequences
scoring	"avg", "max", "sum", "totalhits"	"avg"	Method for scoring a sequence for matches to a PWM (avg, max, sum, totalhits)
hit_lo_fraction	`numeric(1)`	0.25	fraction of hit log odds score to exceed to be considered a "hit"
evalue_report_threshold	`numeric(1)`	10	E-value threshold for reporting a motif as significantly enriched
fasta_threshold	`numeric(1)`	0.001	AME will classify sequences with FASTA scores below this value as positives. Only valid when `method = "fisher", poslist = "pwm", control = NA, fix_partition = NULL`.
fix_partition	`numeric(1)`	`NULL`	AME evaluates only the partition of the first N sequences. Only works when `control = NA` and `poslist = "fasta"`
poslist	"pwm", "fasta"	"fasta"	When using paritioning mode (`control = NA`), test thresholds on either PWM or Fasta score
log_fscores	`logical(1)`	FALSE	Convert FASTA scores into log-space (only used when `method = "pearson"`)
log_pwmscores	`logical(1)`	FALSE	Convert PWM scores into log-space (only used for `method = "pearson"` or `method = "spearman`)
lingreg_switchxy	`logical(1)`	FALSE	Make the x-points FASTA scores and y-points PWM scores (only used for `method = "pearson"` or `method = "spearman`)
xalph	file path	`NULL(1)`	alphabet file to use if input motifs are in different alphabet than input sequences
bfile	"motif", "motif-file", "uniform", path to file	`NULL`	source of 0-order background model. If "motif" or "motif-file" 0-order letter frequencies in the first motif file are used. If "uniform" uses uniform letter frequencies.
motif_pseudo	`numeric(1)`	0.1	Addd this pseudocount when converting from frequency matrix to log-odds matrix
inc	`character(1)`	`NULL`	use only motifs with names matching this regex
exc	`character(1)`	`NULL`	exclude motifs with names matching this regex

Citation

If you use runAme() in your analysis, please cite:

Robert McLeay and Timothy L. Bailey, "Motif Enrichment Analysis: A unified framework and method evaluation", BMC Bioinformatics, 11:165, 2010, doi:10.1186/1471-2105-11-165. full text

Licensing

The MEME Suite is free for non-profit use, but for-profit users should purchase a license. See the MEME Suite Copyright Page for details.

Examples

if (meme_is_installed()) {
# Create random named sequences as input for example
seqs <- universalmotif::create_sequences(rng.seed = 123)
names(seqs) <- seq_along(seqs)

# An example path to a motif database file in .meme format
motif_file <- system.file("extdata", "flyFactorSurvey_cleaned.meme", package = "memes")

runAme(seqs, database = motif_file)

# Dreme results dataset for example
dreme_xml <- system.file("extdata", "dreme.xml", package = "memes")
dreme_results <- importDremeXML(dreme_xml)

# database can be set to multiple values like so: 
runAme(seqs, database = list(motif_file, "my_dreme_motifs" = dreme_results))
}
#> AME detected no enrichment
#> AME detected no enrichment
#> NULL